Abstract: This paper focuses on the prominent sphericity test when the dimension $p$
is much larger than the sample size $n$. The classical likelihood ratio test (LRT) is no longer
applicable when $p \gg n$. We therefore propose a quasi-LRT and establish the asymptotic
distribution of its test statistic under both the null and the alternative hypotheses when
$p/n \to \infty$, $n \to \infty$. We also re-examine the well-known
John's invariant test for sphericity in this ultra-dimensional setting. A notable result
is that John's test statistic has exactly the same limiting distribution
in the ultra-dimensional setting as in the other high-dimensional settings
known in the literature. John's test therefore possesses a powerful
dimension-proof property: its limiting null distribution is the same
under any $(n,p)$-asymptotic, i.e. $p/n \to c \in [0, \infty]$, $n \to \infty$. All asymptotic results are
derived for general populations with finite fourth-order moment. Numerical experiments
illustrate the finite-sample performance of the results.

Keywords and phrases: Sphericity test, Large dimension, ultra-dimension, John’s test, 
Quasi-likelihood Ratio Test. 


1. Introduction 

High-dimensional data, with dimension $p$ of the same order as or even larger than the number
of observations $n$, have recently found many statistical applications in biology and finance. In
particular, the practical need to test gene-wise independence in genomic studies has inspired
a wide range of discussions on testing the structure of the covariance matrix.

In this paper, we consider the prominent sphericity test when the dimension $p$ is much
larger than the sample size $n$. Let $X = (X_1, X_2, \ldots, X_n)$ be a $p \times n$ data matrix with
$n$ independent and identically distributed $p$-dimensional random vectors $\{X_j\}_{1 \le j \le n}$ with
covariance matrix $\Sigma = \mathrm{Var}(X_1)$. Our interest is to test

$$H_0: \Sigma = \sigma^2 I_p \quad \text{vs.} \quad H_1: \Sigma \neq \sigma^2 I_p, \tag{1.1}$$

where $\sigma^2$ is an unknown positive constant. Traditional tests include the likelihood ratio
test (LRT) and John's invariant test.

Consider first the LRT with test statistic (Anderson (1984))

$$-2\log L_n = -2\log\left(\frac{(l_1 \cdots l_p)^{1/p}}{\frac{1}{p}(l_1 + \cdots + l_p)}\right)^{pn/2} = n\log\left(\frac{\left(\frac{1}{p}\sum_{i=1}^p l_i\right)^p}{\prod_{i=1}^p l_i}\right), \tag{1.2}$$

where $\{l_i\}_{1 \le i \le p}$ are the eigenvalues of the $p$-dimensional sample covariance matrix
$\frac{1}{n}\sum_{i=1}^n X_i X_i' = \frac{1}{n}XX'$, $X = (X_1, \ldots, X_n)$. If we let $n \to \infty$ while keeping $p$ fixed, classical asymptotic theory
indicates that under the null hypothesis and assuming the population is normal,

$$-2\log L_n \xrightarrow{d} \chi^2_{\frac{1}{2}p(p+1)-1};$$

the chi-square approximation is further refined by the Box-Bartlett correction. However, this
$\chi^2$-convergence becomes slow when the dimension $p$ increases, so that the LRT (and its
Box-Bartlett correction) is seriously biased when the dimension-to-sample-size ratio $p/n$ is not
small enough.

Wang and Yao (2013) made a bias correction to the traditional LRT under the regime
where $p, n \to \infty$, $p/n \to c \in (0,1)$. They showed that when $X = \{x_{ij}\}_{1 \le i \le p,\, 1 \le j \le n}$ has i.i.d.
entries satisfying $E(x_{ij}) = 0$, $E|x_{ij}|^2 = 1$, $\nu_4 := E|x_{ij}|^4 < \infty$, then under $H_0$,

$$-\frac{2}{n}\log L_n + (p-n)\log\left(1 - \frac{p}{n}\right) - p \xrightarrow{d} N\left(-\frac{1}{2}\log(1-c) + \frac{\nu_4 - 3}{2}c,\; -2\log(1-c) - 2c\right). \tag{1.3}$$

Notice that here the scale parameter $\sigma^2$ in $H_0$ has been taken to be $\sigma^2 = 1$, as the LRT statistic
is invariant under scaling. An extensive simulation study in Wang and Yao (2013) shows that
this test is well adapted to high dimensions and has a very reasonable size and power for
a wide range of dimension-sample size combinations $(p, n)$. The LRT however requires that
$p \le n$, because when $p > n$, $p - n$ of the sample eigenvalues $\{l_i\}$ are null so that the likelihood
ratio $L_n$ is identically null. In this paper, we introduce a quasi-LRT statistic which can be
seen as a natural extension of the LRT statistic to the situation where $p > n$. The quasi-LRT
test statistic is defined as



$$\mathcal{L}_n = \frac{p}{n}\log\left(\frac{\left(\frac{1}{n}\sum_{i=1}^n \tilde{l}_i\right)^n}{\prod_{i=1}^n \tilde{l}_i}\right), \tag{1.4}$$

where $\{\tilde{l}_i\}_{1 \le i \le n}$ are the eigenvalues of the $n$-dimensional matrix $\frac{1}{p}X'X$. The main idea is that the
companion matrix $X'X$ has exactly the same $n$ non-null eigenvalues as the sample covariance
matrix $XX'$ (up to some scaling). Therefore, the quasi-LRT test statistic removes all
the null eigenvalues from the original LRT test statistic, and we find that under the so-called
ultra-dimensional asymptotic $p \gg n$, that is $p/n \to \infty$ and $n \to \infty$,

$$\mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2} \xrightarrow{d} N(0,1).$$

Based on this asymptotic result, a quasi-LRT test can be conducted to test sphericity, which
compensates for the inapplicability of the traditional LRT in the ultra-dimensional setting.
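
As an illustration, the statistic in (1.4) and its centered version could be computed along the following lines. This is a minimal NumPy sketch, not the authors' code; the function names `quasi_lrt_statistic` and `quasi_lrt_standardized` are hypothetical, and the fourth moment `nu4` is assumed to be known (3 for Gaussian entries).

```python
import numpy as np

def quasi_lrt_statistic(X):
    """Quasi-LRT statistic (1.4) for a p x n data matrix X (columns are observations)."""
    p, n = X.shape
    # Eigenvalues of the n x n companion matrix X'X / p.
    l_tilde = np.linalg.eigvalsh(X.T @ X / p)
    # (p/n) * [n * log(arithmetic mean) - sum of logs] is the log-ratio in (1.4).
    return (p / n) * (n * np.log(l_tilde.mean()) - np.log(l_tilde).sum())

def quasi_lrt_standardized(X, nu4=3.0):
    """Centered statistic, approximately N(0, 1) under H0 when p >> n."""
    p, n = X.shape
    return quasi_lrt_statistic(X) - n / 2 - n**2 / (6 * p) - (nu4 - 2) / 2

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 20))   # p = 2000, n = 20, data generated under H0
z = quasi_lrt_standardized(X)         # compare with the upper N(0, 1) quantile
```

Working with the $n \times n$ companion matrix keeps the eigenvalue computation cheap when $p \gg n$, and the statistic is unchanged if `X` is rescaled.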

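Next we consider John's invariant test for sphericity. John (1971, 1972) studied the problem
for normal populations and proposed the test statistic

$$U = \frac{1}{p}\,\mathrm{tr}\left[\left(\frac{S_n}{(1/p)\,\mathrm{tr}(S_n)} - I_p\right)^2\right] = \frac{p^{-1}\sum_{i=1}^p (l_i - \bar{l})^2}{\bar{l}^2}, \tag{1.5}$$

where $S_n = \frac{1}{n}XX'$ is the sample covariance matrix and $\bar{l} = \frac{1}{p}\sum_{i=1}^p l_i$. It has been proved that, as $n \to \infty$ while $p$ remains fixed, the limiting
distribution of $U$ under $H_0$ is

$$\frac{np}{2}\,U \xrightarrow{d} \chi^2_{\frac{1}{2}p(p+1)-1}.$$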
Contrary to the LRT, it has been noticed for a while that John's test does not suffer from
high dimensions and this $\chi^2$ limit is quite accurate even when the ratio $p/n$ is not small.
Ledoit and Wolf (2002) studied the $(n,p)$-consistency of this test statistic under normality
assumptions. They proved that, when $n, p \to \infty$, $\lim_{n \to \infty} p/n = c \in (0, +\infty)$,

$$nU - p \xrightarrow{d} N(1, 4). \tag{1.6}$$

Meanwhile, when $p \to \infty$,

$$\frac{2}{p}\,\chi^2_{p(p+1)/2-1} - p \xrightarrow{d} N(1, 4).$$

In other words, Ledoit and Wolf (2002) extended the classical n-asymptotic theory (where p is 
fixed) to the high-dimensional case where p goes to infinity proportionally with n. Meanwhile, 
the robustness of John’s test is explained in this proportional high-dimensional scheme. 

Wang and Yao (2013) further relaxed the normality restriction and proved that, if $\{x_{ij}\}$
are i.i.d. with $E x_{ij} = 0$, $E|x_{ij}|^2 = 1$, $\nu_4 = E|x_{ij}|^4 < \infty$, then when $n, p \to \infty$,
$\lim_{n \to \infty} p/n = c \in (0, +\infty)$,

$$nU - p \xrightarrow{d} N(\nu_4 - 2, 4). \tag{1.7}$$

Since $\nu_4 = 3$ for the normal distribution, the existing results are consistent with each
other. In this paper, we extend the above result one step further and consider the asymptotic
behavior of John's test statistic in the ultra-dimensional setting $p \gg n$. We find that
this test statistic possesses a remarkable dimension-proof property: under
the $(n,p)$-asymptotic, the limit in (1.7) still holds when $\lim_{n \to \infty} p/n = \infty$. This
dimension-proof property makes John's test a very competitive candidate for sphericity testing
regardless of the relative sizes of $p$ and $n$.
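
The test based on (1.5) and (1.7) can be sketched as follows. This is an illustrative NumPy implementation under the assumption that $\nu_4$ is known; the names `john_statistic` and `john_test` are hypothetical. It uses the identity $U = p\sum_i l_i^2 / (\sum_i l_i)^2 - 1$ and the fact that $XX'/n$ and $X'X/n$ share their non-zero eigenvalues, so only an $n \times n$ matrix is formed.

```python
import numpy as np
from statistics import NormalDist

def john_statistic(X):
    """John's U from (1.5) for a p x n data matrix X (columns are observations)."""
    p, n = X.shape
    M = X.T @ X / n              # n x n companion of the sample covariance XX'/n
    sum_l = np.trace(M)          # sum of the eigenvalues l_i
    sum_l2 = np.sum(M * M)       # tr(M^2) = sum of l_i^2 (M is symmetric)
    return p * sum_l2 / sum_l**2 - 1

def john_test(X, nu4=3.0, alpha=0.05):
    """Return (nU - p, reject?) using the N(nu4 - 2, 4) limit in (1.7)."""
    p, n = X.shape
    stat = n * john_statistic(X) - p
    critical = (nu4 - 2) + 2 * NormalDist().inv_cdf(1 - alpha)
    return stat, stat > critical

rng = np.random.default_rng(0)
stat, reject = john_test(rng.standard_normal((500, 20)))
```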

Related methods have also been proposed in the literature for the high-dimensional sphericity
test. Noteworthy work includes Schott (2005), where a test statistic based on the logarithm
of the norm of the sample correlation matrix is studied under the $(n,p)$-asymptotic;
multivariate normality is, however, assumed in that paper. Similarly, in Fisher et al.
(2010), a novel test statistic utilizing the ratio of the fourth and second arithmetic means of
the sample covariance matrix is developed under the $p/n \to c$, $(n,p)$-asymptotic with a normality
restriction. Srivastava (2005) considered the ratio of arithmetic means of the eigenvalues of the
sample covariance matrix in the normal case when $n = O(p^\delta)$, $\delta > 0$, $n, p \to \infty$, and Srivastava
(2011) further proved the robustness of this test statistic against non-normality,
irrespective of either $n/p \to 0$ or $n/p \to \infty$. However, their results are only applicable under
some specified factorized settings, which makes them less general than John's test. Chen et al.
(2010) developed a high-dimensional test based on John's test; however, this test is very
time-consuming (see Section 2.4). Zou et al. (2013) used multivariate-sign-based
covariance matrices to construct a robust test for sphericity, which significantly enhances test
performance when the non-normality is severe, particularly for heavy-tailed distributions.
In their paper the asymptotic distribution of the test statistic is derived when $p = O(n^2)$.
Srivastava (2006) studied a quasi-likelihood ratio test under the $n = O(p^\delta)$, $0 < \delta < 1$,
$n, p \to \infty$ asymptotic in the normal case, whereas in this paper the normality assumption is
dropped and results are discussed under a wider range of $(n,p)$-asymptotics. These tests are
compared in the simulation studies of Section 2.4.

The rest of the paper is organized as follows. Section 2 discusses the asymptotic behavior of
John's test statistic and the quasi-LRT test statistic under the ultra-dimensional setting.
Empirical sizes and powers of these two tests and other methods are compared under various
scenarios. Section 3 presents theoretical results on the power of John's test and the quasi-LRT
test and verifies these results with simulations. Section 4 concludes. Some technical lemmas and
related proofs are given in Appendix A.


2. New tests and their asymptotic distributions 
2.1. Preliminary Knowledge 

For any $n \times n$ Hermitian matrix $M$ with real eigenvalues $\lambda_1, \ldots, \lambda_n$, the empirical spectral
distribution (ESD for short) of $M$ is defined by $F^M = n^{-1}\sum_{j=1}^n \delta_{\lambda_j}$, where $\delta_a$ denotes the
Dirac mass at $a$. The Stieltjes transform of any distribution $G$ is defined as

$$m_G(z) = \int \frac{1}{x - z}\, dG(x), \quad \Im(z) > 0,$$

where $\Im(z)$ stands for the imaginary part of $z$.


Consider the re-normalized sample covariance matrix

$$A = \sqrt{\frac{p}{n}}\left(\frac{1}{p}X'X - I_n\right),$$

where $X = (x_{ij})_{p \times n}$ and $x_{ij}$, $i = 1, \ldots, p$, $j = 1, \ldots, n$, are i.i.d. real random variables with mean zero and
variance one, and $I_n$ is the identity matrix of order $n$. It is known that under the ultra-dimensional
setting (Bai and Yin, 1988), with probability one, the ESD $F^A$ of the matrix $A$ converges to the
semicircle law $F$ with density

$$F'(x) = \begin{cases} \dfrac{1}{2\pi}\sqrt{4 - x^2}, & \text{if } |x| \le 2, \\ 0, & \text{if } |x| > 2. \end{cases}$$
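
A quick numerical sanity check of this convergence is sketched below, assuming Gaussian entries and illustrative sizes $p = 5000$, $n = 100$ (both choices are assumptions, as is the helper name `renormalized_matrix`). The second and fourth moments of the semicircle law on $[-2, 2]$ are 1 and 2, so the corresponding empirical moments of the eigenvalues of $A$ should be close to these values.

```python
import numpy as np

def renormalized_matrix(X):
    """A = sqrt(p/n) * (X'X / p - I_n) for a p x n matrix X."""
    p, n = X.shape
    return np.sqrt(p / n) * (X.T @ X / p - np.eye(n))

rng = np.random.default_rng(1)
p, n = 5000, 100
lam = np.linalg.eigvalsh(renormalized_matrix(rng.standard_normal((p, n))))

m2 = np.mean(lam**2)   # semicircle second moment is 1
m4 = np.mean(lam**4)   # semicircle fourth moment is 2
```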


We denote the Stieltjes transform of the semicircle law $F$ by $m(z)$. Let $\mathscr{S}$ denote any open
region on the complex plane including $[-2, 2]$, the support of $F$, and let $\mathcal{M}$ be the set of functions
which are analytic on $\mathscr{S}$. For any $f \in \mathcal{M}$, denote

$$G_n(f) = n\int_{-\infty}^{+\infty} f(x)\, d\big(F^A(x) - F(x)\big) - \frac{n}{2\pi i}\oint_{|m| = \rho} f(-m - m^{-1})\, \mathcal{X}_n(m)\, \frac{1 - m^2}{m^2}\, dm, \tag{2.1}$$

where

$$\mathcal{X}_n(m) = \frac{-B + \sqrt{B^2 - 4AC}}{2A}, \qquad A = m - \sqrt{\frac{n}{p}}\,(1 + m^2),$$

$$B = m^2 - 1 - \sqrt{\frac{n}{p}}\, m(1 + 2m^2), \qquad C = \frac{m^3}{n}\left(\frac{m^2}{1 - m^2} + \nu_4 - 2\right) - \sqrt{\frac{n}{p}}\, m^4,$$

$\nu_4 = E x_{11}^4$, and $\sqrt{B^2 - 4AC}$ is a complex number whose imaginary part has the same sign as that
of $B$. The contour of the integral is taken as $|m| = \rho$ with $\rho < 1$. Chen and Pan (2013) give a
calibration of the mean correction term in (2.1), where only $C$ is replaced with

$$C^{\mathrm{Calib}} = \frac{m^3}{n}\left[\nu_4 - 2 + \frac{m^2}{1 - m^2} - 2(\nu_4 - 1)\, m\sqrt{\frac{n}{p}}\right] - \sqrt{\frac{n}{p}}\, m^4,$$

while the others remain the same.

The central limit theorem (CLT) of linear functions of eigenvalues of the re-normalized 
sample covariance matrix A when the dimension p is much larger than the sample size n 
derived by Chen and Pan (2013) is stated as follows. 

Theorem 2.1. Suppose that

(a) $X = (x_{ij})_{p \times n}$ where $\{x_{ij} : i = 1, \ldots, p;\ j = 1, \ldots, n\}$ are i.i.d. real random variables
with $E x_{11} = 0$, $E x_{11}^2 = 1$ and $\nu_4 = E x_{11}^4 < \infty$.

(b) $n/p \to 0$ as $n \to \infty$.

Then, for any $f_1, \ldots, f_k \in \mathcal{M}$, the finite-dimensional random vector $(G_n(f_1), \ldots, G_n(f_k))$
converges weakly to a Gaussian vector $(Y(f_1), \ldots, Y(f_k))$ with mean function $E\,Y(f) = 0$ and
covariance function

$$\mathrm{cov}(Y(f_1), Y(f_2)) = (\nu_4 - 3)\,\Phi_1(f_1)\Phi_1(f_2) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_1)\Phi_k(f_2) \tag{2.2}$$

$$= \frac{1}{4\pi^2}\int_{-2}^{2}\int_{-2}^{2} f_1'(x)\, f_2'(y)\, H(x, y)\, dx\, dy,$$

where

$$\Phi_k(f) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(2\cos\theta)\, e^{ik\theta}\, d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(2\cos\theta)\cos k\theta\, d\theta,$$

$$H(x, y) = (\nu_4 - 3)\sqrt{4 - x^2}\sqrt{4 - y^2} + 2\log\left(\frac{4 - xy + \sqrt{(4 - x^2)(4 - y^2)}}{4 - xy - \sqrt{(4 - x^2)(4 - y^2)}}\right).$$

The proofs of the main theorems in this paper are based on two lemmas derived from this CLT.
Notice that the limiting covariance function in (2.2) was first established in Bai and Yao
(2005) for Wigner matrices.


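Lemma 2.1. Let $\{\lambda_i, 1 \le i \le n\}$ be the eigenvalues of the matrix $A = \sqrt{\frac{p}{n}}\left(\frac{1}{p}X'X - I_n\right)$,
where $X$ satisfies the assumptions in Theorem 2.2. Then, as $p/n \to \infty$, $n \to \infty$,

$$\begin{pmatrix} \sum_{i=1}^n \lambda_i^2 - n - (\nu_4 - 2) \\ \sum_{i=1}^n \lambda_i \end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 4 & 0 \\ 0 & \nu_4 - 1 \end{pmatrix}\right).$$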
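
The covariance matrix in Lemma 2.1 can be recovered numerically from the series form of (2.2) with $f(x) = x^2$ and $f(x) = x$. The sketch below is only an illustration: the helper names `phi` and `limiting_cov` are hypothetical, the series is truncated at `kmax` terms (harmless for polynomials, whose coefficients $\Phi_k$ vanish for large $k$), and $\nu_4 = 4.5$ is an arbitrary example value.

```python
import numpy as np

def phi(f, k, m=4096):
    """Phi_k(f) = (1/2pi) * integral of f(2 cos t) cos(k t) over [-pi, pi].

    The periodic trapezoid rule reduces to a plain average over equispaced nodes.
    """
    t = -np.pi + 2 * np.pi * np.arange(m) / m
    return np.mean(f(2 * np.cos(t)) * np.cos(k * t))

def limiting_cov(f1, f2, nu4, kmax=50):
    """Series form of the covariance (2.2), truncated at kmax terms."""
    series = sum(k * phi(f1, k) * phi(f2, k) for k in range(1, kmax + 1))
    return (nu4 - 3) * phi(f1, 1) * phi(f2, 1) + 2 * series

nu4 = 4.5
square = lambda x: x**2
identity = lambda x: x

var_square = limiting_cov(square, square, nu4)        # variance of sum(lambda_i^2)
var_identity = limiting_cov(identity, identity, nu4)  # variance of sum(lambda_i)
cross = limiting_cov(square, identity, nu4)           # covariance of the two sums
```

These three quantities should agree with the entries $4$, $\nu_4 - 1$ and $0$ of the limiting covariance matrix.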


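Lemma 2.2. Let $\{\lambda_i, 1 \le i \le n\}$ be the eigenvalues of the matrix $A = \sqrt{\frac{p}{n}}\left(\frac{1}{p}X'X - I_n\right)$, where
$X$ satisfies the assumptions in Theorem 2.3. Then, as $p/n \to \infty$, $n \to \infty$,

$$\begin{pmatrix} \sum_{i=1}^n \lambda_i \\ \sqrt{\frac{p}{n}}\sum_{i=1}^n \log\left(1 + \sqrt{\frac{n}{p}}\lambda_i\right) + \frac{n}{2}\sqrt{\frac{n}{p}} + \frac{n^2}{6p}\sqrt{\frac{n}{p}} + \frac{\nu_4 - 2}{2}\sqrt{\frac{n}{p}} \end{pmatrix} = \xi_n + o_p(1),$$

where

$$\xi_n \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \nu_4 - 1 & (\nu_4 - 1)\left(1 + \frac{n}{p}\right) \\ (\nu_4 - 1)\left(1 + \frac{n}{p}\right) & \nu_4 - 1 + \frac{n}{p}(2\nu_4 - 1) \end{pmatrix}\right).$$

The proofs of these two lemmas are postponed to Appendix A.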
2.2. John's Test


Consider John's test statistic $U$ defined in (1.5), based on the eigenvalues of the $p$-dimensional
sample covariance matrix $S = \frac{1}{n}XX'$. Here we assume that the $X_j$'s in $X$ have the representation
$X_j = \Sigma^{1/2}Z_j$, where $Z = \{Z_1, \ldots, Z_n\} = \{z_{ij}\}_{1 \le i \le p,\, 1 \le j \le n}$ is a $p \times n$ matrix with i.i.d. entries
$z_{ij}$ satisfying $E(z_{ij}) = 0$, $E(z_{ij}^2) = 1$. It can be seen that, under the null hypothesis $H_0$,
John's test statistic is independent of the scale parameter $\sigma^2$. Therefore, we assume w.l.o.g.
$\sigma^2 = 1$ when we derive the null distribution of the test statistic. In other words, under $H_0$,
we assume in the rest of this paper that the entries $\{x_{ij}\}_{1 \le i \le p,\, 1 \le j \le n}$ satisfy $E(x_{ij}) = 0$,
$E(x_{ij}^2) = 1$, $E(|x_{ij}|^4) = \nu_4 < +\infty$. The first main result of this paper is the following.

Theorem 2.2. Assume $X = \{x_{ij}\}_{p \times n}$ are i.i.d. satisfying $E(x_{ij}) = 0$, $E(x_{ij}^2) = 1$, $E|x_{ij}|^4 =
\nu_4 < \infty$. Then, when $p/n \to \infty$, $n \to \infty$,

$$nU - p \xrightarrow{d} N(\nu_4 - 2, 4).$$

Similarly to this theorem, Wang and Yao (2013) show that if $\{x_{ij}\}$ are i.i.d. with $E x_{ij} =
0$, $E|x_{ij}|^2 = 1$, $\nu_4 = E|x_{ij}|^4 < \infty$, then when $n, p \to \infty$, $\lim_{n \to \infty} p/n = c \in (0, +\infty)$,

$$nU - p \xrightarrow{d} N(\nu_4 - 2, 4).$$

This indicates that as long as $X = \{x_{ij}\}_{p \times n}$ are i.i.d. with zero mean, unit variance and finite
fourth-order moment, John's test statistic $nU - p$ has the same limiting distribution
$N(\nu_4 - 2, 4)$, regardless of normality, under any $(n,p)$-asymptotic with $n/p \to c \in [0, \infty)$. Therefore,
this powerful dimension-proof property gives John's test top priority when little information
about the data is known before implementing the sphericity test.


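The proof of Theorem 2.2 is based on Lemma 2.1.

Proof. Denote the eigenvalues of the $p \times p$ matrix $S_n = \frac{1}{n}XX'$ in descending order by $l_i$ $(1 \le i \le p)$,
and the eigenvalues of the $n \times n$ matrix $A = \sqrt{\frac{p}{n}}\left(\frac{1}{p}X'X - I_n\right)$ by $\lambda_i$ $(1 \le i \le n)$. Since $p > n$,
$S_n$ has $p - n$ zero eigenvalues and the remaining $n$ non-zero eigenvalues $l_i$ $(1 \le i \le n)$ are
related to the eigenvalues $\lambda_i$ $(1 \le i \le n)$ of $A$ by

$$\sqrt{\frac{p}{n}}\,\lambda_i + \frac{p}{n} = l_i, \quad 1 \le i \le n.$$

We have, for John's test statistic,

$$U = \frac{\frac{1}{p}\sum_{i=1}^n\left(\sqrt{\frac{p}{n}}\lambda_i + \frac{p}{n}\right)^2}{\left(\frac{1}{p}\sum_{i=1}^n\left(\sqrt{\frac{p}{n}}\lambda_i + \frac{p}{n}\right)\right)^2} - 1 = \frac{\sum_{i=1}^n \lambda_i^2 + 2\sqrt{\frac{p}{n}}\sum_{i=1}^n \lambda_i + p}{\left(\frac{1}{\sqrt{p}}\sum_{i=1}^n \lambda_i + \sqrt{n}\right)^2} - 1.$$

Define the function

$$G(u, v) = \frac{u + 2v\sqrt{\frac{p}{n}} + p}{\left(\frac{v}{\sqrt{p}} + \sqrt{n}\right)^2} - 1,$$

then John's test statistic can be written as

$$U = G\left(u = \sum_{i=1}^n \lambda_i^2,\; v = \sum_{i=1}^n \lambda_i\right).$$

According to Lemma 2.1, when $p/n \to \infty$, $n \to \infty$,

$$\begin{pmatrix} \sum_{i=1}^n \lambda_i^2 - n - (\nu_4 - 2) \\ \sum_{i=1}^n \lambda_i \end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 4 & 0 \\ 0 & \nu_4 - 1 \end{pmatrix}\right).$$

Then by the Delta Method,

$$n\left(U - G(u, v)\big|_{u = n + \nu_4 - 2,\, v = 0}\right) = \xi_n + o_p(1),$$

where

$$\xi_n \sim N\left(0,\; n^2\, \nabla G \begin{pmatrix} 4 & 0 \\ 0 & \nu_4 - 1 \end{pmatrix} \nabla G'\right)$$

and $\nabla G = \left(\frac{\partial G}{\partial u}, \frac{\partial G}{\partial v}\right)\Big|_{u = n + \nu_4 - 2,\, v = 0}$ is the corresponding gradient vector.

We have, for $(u, v) = (n + \nu_4 - 2, 0)$,

$$G = \frac{p}{n} + \frac{\nu_4 - 2}{n},$$

and

$$n^2\, \nabla G \begin{pmatrix} 4 & 0 \\ 0 & \nu_4 - 1 \end{pmatrix} \nabla G' = 4 + \frac{4(\nu_4 - 1)}{np}\,(n + \nu_4 - 2)^2 \to 4.$$

The conclusion thus follows. □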
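
The two algebraic steps used in this proof, the eigenvalue relation between $S_n$ and $A$ and the representation of $U$ through $G$, can be checked numerically. The following NumPy sketch uses illustrative sizes $p = 400$, $n = 10$ and Gaussian data; both are assumptions made only for the check.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 400, 10
X = rng.standard_normal((p, n))

# Non-zero eigenvalues of S_n = XX'/n (computed from the n x n companion matrix).
l = np.linalg.eigvalsh(X.T @ X / n)
# Eigenvalues of the re-normalized matrix A.
lam = np.linalg.eigvalsh(np.sqrt(p / n) * (X.T @ X / p - np.eye(n)))

def G(u, v):
    """The function G(u, v) defined in the proof of Theorem 2.2."""
    return (u + 2 * v * np.sqrt(p / n) + p) / (v / np.sqrt(p) + np.sqrt(n)) ** 2 - 1

# John's U over all p eigenvalues; the p - n zero eigenvalues do not enter the sums.
U_direct = p * np.sum(l**2) / np.sum(l) ** 2 - 1
U_via_G = G(np.sum(lam**2), np.sum(lam))
relation_holds = np.allclose(l, np.sqrt(p / n) * lam + p / n)
```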


2.3. Quasi-likelihood ratio test 

Consider the quasi-LRT statistic $\mathcal{L}_n$ in (1.4), based on the eigenvalues of the $n$-dimensional
matrix $\frac{1}{p}X'X$, which are also proportional to the non-null eigenvalues of the $p$-dimensional
sample covariance matrix $\frac{1}{n}XX'$. Similarly to John's test statistic, it can be seen that, under
the null hypothesis $H_0$, the $\mathcal{L}_n$ statistic is independent of the scale parameter $\sigma^2$. Therefore,
we again assume w.l.o.g. $\sigma^2 = 1$ when we derive the null distribution of the test statistic. The
second main result of this paper is the following theorem.

Theorem 2.3. Assume $X = \{x_{ij}\}_{p \times n}$ are i.i.d. satisfying $E(x_{ij}) = 0$, $E(x_{ij}^2) = 1$, $E|x_{ij}|^4 =
\nu_4 < \infty$. Then, when $p/n \to \infty$, $n \to \infty$,

$$\mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2} \xrightarrow{d} N(0, 1). \tag{2.3}$$
Recall that for the classic LRT, when $H_0$ holds and $p$ is fixed while $n \to \infty$, if the population is
Gaussian, the test statistic satisfies

$$-2\log L_n = n\log\left(\frac{\left(\frac{1}{p}\sum_{i=1}^p l_i\right)^p}{\prod_{i=1}^p l_i}\right) \xrightarrow{d} \chi^2_{\frac{1}{2}p(p+1)-1},$$

where $\{l_i\}_{1 \le i \le p}$ are the eigenvalues of the $p$-dimensional sample covariance matrix $\frac{1}{n}XX'$. Here
we notice that $n/p \to \infty$.

By interchanging the roles of $n$ and $p$, which is feasible under $H_0$, it can be seen that when
$n$ is fixed and $p/n \to \infty$, the test statistic

$$-2\log L_p = p\log\left(\frac{\left(\frac{1}{n}\sum_{i=1}^n \tilde{l}_i\right)^n}{\prod_{i=1}^n \tilde{l}_i}\right) \xrightarrow{d} \chi^2_{\frac{1}{2}n(n+1)-1},$$

where $\{\tilde{l}_i\}_{1 \le i \le n}$ are the eigenvalues of the $n$-dimensional matrix $\frac{1}{p}X'X$. Note that
$(-2\log L_p)/n$ coincides with our quasi-LRT statistic $\mathcal{L}_n$. Heuristically, if next we let $n \to \infty$,
then

$$\frac{\chi^2_{\frac{1}{2}n(n+1)-1}}{n} - \frac{n+1}{2} \xrightarrow{d} N(0, 1),$$

which is nothing but (2.3) applied to the normal case ($\nu_4 = 3$) with fixed $n$ and $p \to \infty$.
Therefore, the classical LRT can be thought of as a particular "finite-dimensional" instance
of the general limit (2.3) for the quasi-LRT; that is, Theorem 2.3 covers a wide range of
"large p, small n" situations.
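
The heuristic normal limit above can be checked by a short moment computation, given here as a worked step using only the mean $k$ and variance $2k$ of a $\chi^2_k$ variable with $k = \frac{1}{2}n(n+1) - 1$:

$$E\left[\frac{\chi^2_k}{n}\right] = \frac{n(n+1)/2 - 1}{n} = \frac{n+1}{2} - \frac{1}{n}, \qquad \mathrm{Var}\left[\frac{\chi^2_k}{n}\right] = \frac{2\left(n(n+1)/2 - 1\right)}{n^2} = 1 + \frac{1}{n} - \frac{2}{n^2}.$$

As $n \to \infty$ the mean differs from $\frac{n+1}{2}$ by a vanishing term and the variance tends to one. The centering $\frac{n+1}{2} = \frac{n}{2} + \frac{1}{2}$ matches the centering $\frac{n}{2} + \frac{n^2}{6p} + \frac{\nu_4 - 2}{2}$ of (2.3) when $\nu_4 = 3$ and $n^2/p \to 0$.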

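The proof of Theorem 2.3 is based on Lemma 2.2.

Proof. Denote the eigenvalues of the $n \times n$ matrix $\frac{1}{p}X'X$ in descending order by $\tilde{l}_i$ $(1 \le i \le n)$,
and the eigenvalues of the $n \times n$ matrix $A = \sqrt{\frac{p}{n}}\left(\frac{1}{p}X'X - I_n\right)$ by $\lambda_i$ $(1 \le i \le n)$. These eigenvalues
are related by

$$\sqrt{\frac{n}{p}}\,\lambda_i + 1 = \tilde{l}_i, \quad 1 \le i \le n.$$

We have, for the quasi-LRT test statistic,

$$\mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2} = \frac{p}{n}\log\left(\frac{\left(\frac{1}{n}\sum_{i=1}^n \tilde{l}_i\right)^n}{\prod_{i=1}^n \tilde{l}_i}\right) - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2}$$

$$= p\log\left(1 + \sqrt{\frac{n}{p}}\,\frac{1}{n}\sum_{i=1}^n \lambda_i\right) - \frac{p}{n}\sum_{i=1}^n \log\left(1 + \sqrt{\frac{n}{p}}\lambda_i\right) - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2}.$$

Define the function

$$G(u, v) = p\log\left(1 + \sqrt{\frac{n}{p}}\,\frac{u}{n}\right) - \sqrt{\frac{p}{n}}\, v,$$

then the centered quasi-LRT test statistic can be written as

$$\mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2} = G\left(u = \sum_{i=1}^n \lambda_i,\; v = \sqrt{\frac{p}{n}}\sum_{i=1}^n \log\left(1 + \sqrt{\frac{n}{p}}\lambda_i\right) + \frac{n}{2}\sqrt{\frac{n}{p}} + \frac{n^2}{6p}\sqrt{\frac{n}{p}} + \frac{\nu_4 - 2}{2}\sqrt{\frac{n}{p}}\right).$$

According to Lemma 2.2, when $p/n \to \infty$, $n \to \infty$,

$$\begin{pmatrix} \sum_{i=1}^n \lambda_i \\ \sqrt{\frac{p}{n}}\sum_{i=1}^n \log\left(1 + \sqrt{\frac{n}{p}}\lambda_i\right) + \frac{n}{2}\sqrt{\frac{n}{p}} + \frac{n^2}{6p}\sqrt{\frac{n}{p}} + \frac{\nu_4 - 2}{2}\sqrt{\frac{n}{p}} \end{pmatrix} = \xi_n + o_p(1),$$

where

$$\xi_n \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \nu_4 - 1 & (\nu_4 - 1)\left(1 + \frac{n}{p}\right) \\ (\nu_4 - 1)\left(1 + \frac{n}{p}\right) & \nu_4 - 1 + \frac{n}{p}(2\nu_4 - 1) \end{pmatrix}\right).$$

Then by the Delta Method,

$$\mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2} - G(u, v)\big|_{u = 0,\, v = 0} \xrightarrow{d} N\left(0,\; \nabla G \begin{pmatrix} \nu_4 - 1 & (\nu_4 - 1)\left(1 + \frac{n}{p}\right) \\ (\nu_4 - 1)\left(1 + \frac{n}{p}\right) & \nu_4 - 1 + \frac{n}{p}(2\nu_4 - 1) \end{pmatrix} \nabla G'\right),$$

where $\nabla G = \left(\frac{\partial G}{\partial u}, \frac{\partial G}{\partial v}\right)\Big|_{u = 0,\, v = 0} = \sqrt{\frac{p}{n}}\,(1, -1)$ is the corresponding gradient vector.

We have, for $(u, v) = (0, 0)$, $G = 0$ and

$$\nabla G \begin{pmatrix} \nu_4 - 1 & (\nu_4 - 1)\left(1 + \frac{n}{p}\right) \\ (\nu_4 - 1)\left(1 + \frac{n}{p}\right) & \nu_4 - 1 + \frac{n}{p}(2\nu_4 - 1) \end{pmatrix} \nabla G' = 1.$$

Therefore, when $p/n \to \infty$, $n \to \infty$,

$$\mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2} \xrightarrow{d} N(0, 1).$$

□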


2.4. Simulation Studies

In order to further explore the finite-sample behavior of John's sphericity test when the
dimension $p$ is significantly larger than the sample size $n$, Monte Carlo simulations are implemented
in this section to evaluate the size and power of John's sphericity test. The test statistic proposed
by Chen et al. (2010) is also considered for comparison.

In the simulation, without loss of generality, we conduct the sphericity test with $\sigma^2 = 1$.
To find the empirical sizes of the tests, we consider two different scenarios to generate
sample data:

(1) $\{X_j\}$, $1 \le j \le n$, i.i.d. $p$-dimensional random vectors generated from the multivariate normal
population $N(0, I_p)$, so that $E x_{ij}^4 = \nu_4 = 3$;

(2) $\{x_{ij}, 1 \le i \le p, 1 \le j \le n\}$ i.i.d. following the Gamma(4,2) − 2 distribution, so that $E x_{ij} = 0$,
$E x_{ij}^2 = 1$, $E x_{ij}^4 = \nu_4 = 4.5$.

We set the sample size $n = 64$ and the dimension $p = 320, 640, 960, 1280, 1600, 2400, 3200$ in order to
understand the effect of an increasing dimension. The nominal test level is $\alpha = 0.05$. For each
pair $(p, n)$, 10000 replications are used to get the empirical size.


For John's test, we reject $H_0$ if $nU - p$ exceeds the 5% upper quantile of the $N(\nu_4 - 2, 4)$
distribution. For the quasi-LRT test, we reject $H_0$ if $\mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2}$ exceeds the 5% upper
quantile of the $N(0, 1)$ distribution.
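
The size experiment for John's test under scenario (2) can be sketched as below. This is a scaled-down illustration, not the simulation code behind the tables: the function names are hypothetical and only 200 replications are used instead of 10000, so the estimate is noisy. Gamma(4,2) is taken to mean shape 4 and rate 2, which NumPy expresses as `scale=0.5`.

```python
import numpy as np
from statistics import NormalDist

def john_rejects(X, nu4, alpha=0.05):
    """Reject H0 when nU - p exceeds the upper-alpha quantile of N(nu4 - 2, 4)."""
    p, n = X.shape
    M = X.T @ X / n
    U = p * np.sum(M * M) / np.trace(M) ** 2 - 1
    critical = (nu4 - 2) + 2 * NormalDist().inv_cdf(1 - alpha)
    return n * U - p > critical

def empirical_size(p, n, reps, rng):
    """Fraction of rejections over `reps` data sets drawn under H0, scenario (2)."""
    hits = 0
    for _ in range(reps):
        # Gamma(shape=4, rate=2) - 2: mean 0, variance 1, fourth moment 4.5.
        X = rng.gamma(shape=4.0, scale=0.5, size=(p, n)) - 2.0
        hits += int(john_rejects(X, nu4=4.5))
    return hits / reps

rng = np.random.default_rng(3)
size = empirical_size(p=320, n=64, reps=200, rng=rng)
```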

As for the test in Chen et al. (2010), the test statistic is defined as follows:

$$U_n = p\left(\frac{T_{2,n}}{T_{1,n}^2}\right) - 1,$$

where

$$T_{1,n} = \frac{1}{n}\sum_{i=1}^n X_i'X_i - \frac{1}{P_n^2}\sum_{i,j}^{*} X_i'X_j,$$

$$T_{2,n} = \frac{1}{P_n^2}\sum_{i,j}^{*} (X_i'X_j)^2 - \frac{2}{P_n^3}\sum_{i,j,k}^{*} X_i'X_j X_j'X_k + \frac{1}{P_n^4}\sum_{i,j,k,l}^{*} X_i'X_j X_k'X_l,$$

where $P_n^r = n!/(n - r)!$ and $\sum^{*}$ denotes summation over mutually different indices. Then we
reject $H_0$ if $nU_n$ exceeds the 5% upper quantile of the $N(0, 4)$ distribution.

For the test in Srivastava (2011) (Sri for short), the test statistic is defined as follows:

$$W_n = \frac{n}{2}\left[c_n \cdot \frac{\frac{1}{p}\left[\mathrm{tr}(S^2) - \frac{1}{n}(\mathrm{tr}\,S)^2\right]}{\left(\frac{1}{p}\,\mathrm{tr}\,S\right)^2} - 1\right],$$

where $S = \frac{1}{n}XX'$ and $c_n = \frac{n^2}{(n-1)(n+2)}$. According to the limiting distribution of $W_n$, we reject
$H_0$ if $W_n$ exceeds the 5% upper quantile of the $N(0, 1)$ distribution. As for empirical powers, we
generate sample data from two alternatives:

- Power 1: $\Sigma$ is diagonal with half of its diagonal elements equal to 0.5 and half equal to 1;

- Power 2: $\Sigma$ is diagonal with 1/4 of its diagonal elements equal to 0.5 and 3/4 equal to 1.
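
Data under these two alternatives could be generated as in the sketch below, which assumes Gaussian entries and the representation $X_j = \Sigma^{1/2}Z_j$ with diagonal $\Sigma$. The helper names `alternative_diagonal` and `sample_under_alternative` are hypothetical.

```python
import numpy as np

def alternative_diagonal(p, frac_half):
    """Diagonal of Sigma: a fraction `frac_half` of the entries is 0.5, the rest is 1."""
    k = int(round(frac_half * p))
    return np.concatenate([np.full(k, 0.5), np.ones(p - k)])

def sample_under_alternative(diag, n, rng):
    """Draw a p x n matrix X = Sigma^{1/2} Z with Z having i.i.d. N(0, 1) entries."""
    Z = rng.standard_normal((len(diag), n))
    return np.sqrt(diag)[:, None] * Z   # row i is scaled by sqrt(Sigma_ii)

rng = np.random.default_rng(4)
X_power1 = sample_under_alternative(alternative_diagonal(320, 0.5), 64, rng)
X_power2 = sample_under_alternative(alternative_diagonal(320, 0.25), 64, rng)
```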

Table 1 reports the empirical sizes and powers of the tests for Gaussian data. Table 2 is
for non-Gaussian data.


Table 1
Scenario 1 for Gaussian Data

             |             Size              |            Power 1            |            Power 2
(p, n)       | Sri     Chen    John    QLRT  | Sri     Chen    John    QLRT  | Sri     Chen    John    QLRT
(320, 64)    | 0.048   0.0539  0.0492  0.0998| 0.9571  0.9532  0.958   0.9777| 0.6155  0.6117  0.6194  0.7352
(640, 64)    | 0.0504  0.0538  0.0515  0.0668| 0.9595  0.9542  0.9602  0.9638| 0.6089  0.6065  0.6128  0.6562
(960, 64)    | 0.0532  0.0581  0.0544  0.062 | 0.9598  0.9569  0.9604  0.9647| 0.6201  0.6144  0.6231  0.6482
(1280, 64)   | 0.0519  0.0603  0.053   0.0568| 0.9609  0.9569  0.9615  0.9656| 0.6076  0.6043  0.6129  0.6256
(1600, 64)   | 0.0529  0.0571  0.0539  0.0593| 0.9583  0.9539  0.9588  0.9627| 0.6194  0.6146  0.6231  0.6378
(2400, 64)   | 0.0493  0.0536  0.0501  0.0506| 0.9588  0.9542  0.9591  0.9615| 0.6171  0.6099  0.621   0.6291
(3200, 64)   | 0.0472  0.0538  0.0481  0.0503| 0.9617  0.9576  0.9624  0.9625| 0.6212  0.619   0.6251  0.6301


Table 2
Scenario 2 for Non-Gaussian Data

             |             Size              |            Power 1            |            Power 2
(p, n)       | Sri     Chen    John    QLRT  | Sri     Chen    John    QLRT  | Sri     Chen    John    QLRT
(320, 64)    | 0.1828  0.0584  0.0566  0.1084| 0.9909  0.9476  0.9538  0.9701| 0.8374  0.6044  0.6196  0.7299
(640, 64)    | 0.1875  0.0594  0.0598  0.0735| 0.9927  0.9566  0.9603  0.9653| 0.8379  0.6051  0.6201  0.6601
(960, 64)    | 0.1869  0.058   0.0551  0.0631| 0.9923  0.9524  0.9589  0.9608| 0.8394  0.6121  0.6298  0.6502
(1280, 64)   | 0.1856  0.057   0.0517  0.0605| 0.9927  0.9529  0.9599  0.962 | 0.8483  0.6133  0.6206  0.6416
(1600, 64)   | 0.1811  0.0555  0.0536  0.058 | 0.9925  0.9557  0.9622  0.9642| 0.8433  0.6143  0.633   0.6407
(2400, 64)   | 0.179   0.0581  0.0533  0.0564| 0.991   0.9497  0.9567  0.9577| 0.8425  0.611   0.6261  0.6304
(3200, 64)   | 0.1757  0.0518  0.0503  0.0522| 0.9909  0.9529  0.961   0.9611| 0.8413  0.6143  0.6266  0.6319


It can be seen from the above results that both John's test and the QLRT perform well with
respect to sizes and powers. Empirical powers under Power 1 are in general higher than under
Power 2 because of the more significant difference between $H_0$ and $H_1$. John's test performs
slightly better than Chen's method. In all tested scenarios, the QLRT dominates the other
two tests in terms of power, even though the difference is quite marginal. Srivastava's test
performs slightly below John's test in the Gaussian case and suffers from non-normality,
with a non-negligible size distortion in Table 2. Furthermore, we have recorded the execution time of these tests
in the different scenarios and we find that Chen's method is more time-consuming due to
more complicated computations.

3. Power of the tests 

In this section we study the asymptotic power of the two tests. To begin with, some
preliminary knowledge is introduced as follows.

3.1. Preliminary knowledge 

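Consider the re-normalized sample covariance matrix

$$A = \frac{1}{\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}}\left(Z'\Sigma_p Z - \mathrm{tr}(\Sigma_p)\, I_n\right),$$

where $Z = (z_{ij})_{p \times n}$ and $z_{ij}$, $i = 1, \ldots, p$, $j = 1, \ldots, n$, are i.i.d. real random variables with
mean zero and variance one, $I_n$ is the identity matrix of order $n$, and $\Sigma_p$ is a sequence of $p \times p$
non-negative definite matrices with bounded spectral norm. Assume the following limits exist:

(a) $\gamma = \lim_{p \to \infty} \frac{1}{p}\,\mathrm{tr}(\Sigma_p)$,

(b) $\theta = \lim_{p \to \infty} \frac{1}{p}\,\mathrm{tr}(\Sigma_p^2)$,

(c) $\omega = \lim_{p \to \infty} \frac{1}{p}\sum_{i=1}^p (\Sigma_{ii})^2$.

It has been proven that, under the ultra-dimensional setting (Bai and Yin, 1988), with
probability one, the ESD $F^A$ of the matrix $A$ converges to the semicircle law $F$ with density

$$F'(x) = \begin{cases} \dfrac{1}{2\pi}\sqrt{4 - x^2}, & \text{if } |x| \le 2, \\ 0, & \text{if } |x| > 2. \end{cases}$$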

We denote the Stieltjes transform of the semicircle law $F$ by $m(z)$. Let $\mathscr{S}$ denote any open
region on the complex plane including $[-2, 2]$, the support of $F$, and let $\mathcal{M}$ be the set of functions
which are analytic on $\mathscr{S}$. For any $f \in \mathcal{M}$, denote

$$G_n(f) = n\int_{-\infty}^{+\infty} f(x)\, d\big(F^A(x) - F(x)\big) - \sqrt{\frac{n^3}{p}}\,\Phi_3(f),$$

where, for any positive integer $k$,

$$\Phi_k(f) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(2\cos\theta)\cos(k\theta)\, d\theta.$$

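The limiting theory of the test statistics under the alternative $H_1$ is based on a new CLT for
linear statistics of $A$, provided in Li and Yao (2016), as follows.

Theorem 3.1. Suppose that

(1) $Z = (z_{ij})_{p \times n}$ where $\{z_{ij} : i = 1, \ldots, p;\ j = 1, \ldots, n\}$ are i.i.d. real random variables
with $E z_{ij} = 0$, $E z_{ij}^2 = 1$ and $\nu_4 = E z_{ij}^4 < \infty$;

(2) $(\Sigma_p)$ is a sequence of $p \times p$ non-negative definite matrices with bounded spectral norm
and the following limits exist:

(a) $\gamma = \lim_{p \to \infty} \frac{1}{p}\,\mathrm{tr}(\Sigma_p)$,

(b) $\theta = \lim_{p \to \infty} \frac{1}{p}\,\mathrm{tr}(\Sigma_p^2)$,

(c) $\omega = \lim_{p \to \infty} \frac{1}{p}\sum_{i=1}^p (\Sigma_{ii})^2$;

(3) $p/n \to \infty$ as $n \to \infty$, $n^3/p = O(1)$.

Then, for any $f_1, \ldots, f_k \in \mathcal{M}$, the finite-dimensional random vector $(G_n(f_1), \ldots, G_n(f_k))$
converges weakly to a Gaussian vector $(Y(f_1), \ldots, Y(f_k))$ with mean function

$$E\,Y(f) = \frac{1}{4}\big(f(2) + f(-2)\big) - \frac{1}{2}\Phi_0(f) + \frac{\omega}{\theta}(\nu_4 - 3)\,\Phi_2(f),$$

and covariance function

$$\mathrm{cov}(Y(f_1), Y(f_2)) = \frac{\omega}{\theta}(\nu_4 - 3)\,\Phi_1(f_1)\Phi_1(f_2) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_1)\Phi_k(f_2) \tag{3.1}$$

$$= \frac{1}{4\pi^2}\int_{-2}^{2}\int_{-2}^{2} f_1'(x)\, f_2'(y)\, H(x, y)\, dx\, dy,$$

where

$$\Phi_k(f) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(2\cos\theta)\, e^{ik\theta}\, d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(2\cos\theta)\cos k\theta\, d\theta,$$

$$H(x, y) = \frac{\omega}{\theta}(\nu_4 - 3)\sqrt{4 - x^2}\sqrt{4 - y^2} + 2\log\left(\frac{4 - xy + \sqrt{(4 - x^2)(4 - y^2)}}{4 - xy - \sqrt{(4 - x^2)(4 - y^2)}}\right).$$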


The proofs of Theorem 3.2 and 3.3 about the power of the two test statistics are based on 
two lemmas derived from this CLT. 
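Lemma 3.1. Let $\{\lambda_i, 1 \le i \le n\}$ be the eigenvalues of the matrix $A = \frac{1}{\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}}\left(Z'\Sigma_p Z - \mathrm{tr}(\Sigma_p)\, I_n\right)$,
where $Z$, $\Sigma_p$ satisfy the assumptions in Theorem 3.1. Then

$$\begin{pmatrix} \sum_{i=1}^n \lambda_i^2 - n - \left(\frac{\omega}{\theta}(\nu_4 - 3) + 1\right) \\ \sum_{i=1}^n \lambda_i \end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 4 & 0 \\ 0 & \frac{\omega}{\theta}(\nu_4 - 3) + 2 \end{pmatrix}\right)$$

as $p/n \to \infty$, $n \to \infty$, $n^3/p = O(1)$.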

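Lemma 3.2. Let $\{\lambda_i, 1 \le i \le n\}$ be the eigenvalues of the matrix $A = \frac{1}{\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}}\left(Z'\Sigma_p Z - \mathrm{tr}(\Sigma_p)\, I_n\right)$,
where $Z$, $\Sigma_p$ satisfy the assumptions in Theorem 3.1. Then

$$\begin{pmatrix} \sum_{i=1}^n \lambda_i \\ \sqrt{\frac{p}{n}}\sum_{i=1}^n \log\left(\gamma + \sqrt{\frac{n\theta}{p}}\lambda_i\right) - \sqrt{np}\,\log(\gamma) + \frac{\theta}{2\gamma^2}\, n\sqrt{\frac{n}{p}} + \left(\frac{\theta^2}{2\gamma^4} - \frac{\theta\sqrt{\theta}}{3\gamma^3}\right)\frac{n^2}{p}\sqrt{\frac{n}{p}} + \left(\frac{\theta}{2\gamma^2} + \frac{\omega}{2\gamma^2}(\nu_4 - 3)\right)\sqrt{\frac{n}{p}} \end{pmatrix} = \xi_n + o_p(1),$$

where

$$\xi_n \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \frac{\omega}{\theta}(\nu_4 - 3) + 2 & \left(\frac{\omega}{\theta}(\nu_4 - 3) + 2\right)\left(\frac{\sqrt{\theta}}{\gamma} + \frac{\theta\sqrt{\theta}}{\gamma^3}\frac{n}{p}\right) \\ \left(\frac{\omega}{\theta}(\nu_4 - 3) + 2\right)\left(\frac{\sqrt{\theta}}{\gamma} + \frac{\theta\sqrt{\theta}}{\gamma^3}\frac{n}{p}\right) & \left(\frac{\omega}{\theta}(\nu_4 - 3) + 2\right)\frac{\theta}{\gamma^2} + \left(\frac{2\omega}{\theta}(\nu_4 - 3) + 5\right)\frac{\theta^2}{\gamma^4}\frac{n}{p} \end{pmatrix}\right)$$

as $p/n \to \infty$, $n \to \infty$, $n^3/p = O(1)$.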

The proofs of these two lemmas are postponed to Appendix A.


3.2. John’s test 

Suppose that i.i.d. $p$-dimensional sample vectors $X_1, \ldots, X_n$ follow a multivariate
distribution with covariance matrix $\Sigma_p$. To explore the power of John's test under the
alternative hypothesis $H_1: \Sigma_p \neq \sigma^2 I_p$, we assume that the $X_j$'s in $X$ have the representation
$X_j = \Sigma_p^{1/2} Z_j$, so that $S = \frac{1}{n}\Sigma_p^{1/2} ZZ' \Sigma_p^{1/2}$, where $Z = \{Z_1, \ldots, Z_n\} = \{z_{ij}\}_{1 \le i \le p,\, 1 \le j \le n}$ is a
$p \times n$ matrix with i.i.d. entries $z_{ij}$ satisfying $E(z_{ij}) = 0$, $E(z_{ij}^2) = 1$ and $E(|z_{ij}|^4) = \nu_4 < +\infty$.

Then John's test statistic is

$$U = \frac{p^{-1}\sum_{i=1}^p (l_i - \bar{l})^2}{\bar{l}^2},$$

where $\{l_i, 1 \le i \le p\}$ are the eigenvalues of the $p$-dimensional sample covariance matrix $S =
\frac{1}{n}\Sigma_p^{1/2} ZZ' \Sigma_p^{1/2}$. The main result on the power of John's test is as follows.

Theorem 3.2. Assume $X_1, \ldots, X_n$ are i.i.d. $p$-dimensional sample vectors following a multivariate
distribution with covariance matrix $\Sigma_p$, $X = \Sigma_p^{1/2} Z$ where $Z = \{z_{ij}\}$ is a $p \times n$ matrix
with i.i.d. entries satisfying $E(z_{ij}) = 0$, $E(z_{ij}^2) = 1$, $E|z_{ij}|^4 = \nu_4 < \infty$, and $(\Sigma_p)$ is a sequence of
$p \times p$ non-negative definite matrices with bounded spectral norm such that the following limits exist:

(a) $\gamma = \lim_{p \to \infty} \frac{1}{p}\,\mathrm{tr}(\Sigma_p)$,

(b) $\theta = \lim_{p \to \infty} \frac{1}{p}\,\mathrm{tr}(\Sigma_p^2)$,

(c) $\omega = \lim_{p \to \infty} \frac{1}{p}\sum_{i=1}^p (\Sigma_{ii})^2$.

Then, when $p/n \to \infty$, $n \to \infty$, $n^3/p = O(1)$,

$$nU - p - \left(\frac{\theta}{\gamma^2} - 1\right) n \xrightarrow{d} N\left(\frac{\theta + \omega(\nu_4 - 3)}{\gamma^2},\; \frac{4\theta^2}{\gamma^4}\right).$$


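Note that the theorem above gives the limiting distribution of John's test statistic under the
alternative hypothesis $H_1$. If we let $\Sigma_p = \sigma^2 I_p$, then $\gamma = \sigma^2$, $\theta = \omega = \sigma^4$, and Theorem
3.2 reduces to Theorem 2.2, which states the null distribution of John's test statistic under
$H_0$. With the two limiting distributions of John's test statistic under $H_0$ and $H_1$, the power of the
test is derived as below.


Proposition 3.1. With the same assumptions as in Theorem 3.2, when $p/n \to \infty$, $n \to \infty$,
$n^3/p = O(1)$, the power of John's test is

$$\beta_{\mathrm{John}}(H_1) = 1 - \Phi\left(\frac{\gamma^2}{\theta} Z_\alpha + \frac{\gamma^2(\nu_4 - 2) - \theta - \omega(\nu_4 - 3)}{2\theta} + \frac{(\gamma^2 - \theta)\, n}{2\theta}\right),$$

where $\alpha$ is the nominal test level, and $Z_\alpha$, $\Phi(\cdot)$ are the $\alpha$ upper quantile and cdf of the standard
normal distribution respectively.

For John's test statistic $U$, under $H_0$,

$$nU - p \xrightarrow{d} N(\nu_4 - 2, 4),$$

and under $H_1$,

$$nU - p - \left(\frac{\theta}{\gamma^2} - 1\right) n \xrightarrow{d} N\left(\frac{\theta + \omega(\nu_4 - 3)}{\gamma^2},\; \frac{4\theta^2}{\gamma^4}\right).$$

Hence

$$\beta_{\mathrm{John}}(H_1) = P\left(\frac{nU - p - (\nu_4 - 2)}{2} > Z_\alpha \,\Big|\, H_1\right)$$

$$= P\left(\frac{nU - p - n\left(\frac{\theta}{\gamma^2} - 1\right) - \frac{\theta + \omega(\nu_4 - 3)}{\gamma^2}}{2\theta/\gamma^2} > \frac{2Z_\alpha + (\nu_4 - 2) - n\left(\frac{\theta}{\gamma^2} - 1\right) - \frac{\theta + \omega(\nu_4 - 3)}{\gamma^2}}{2\theta/\gamma^2} \,\Big|\, H_1\right)$$

$$= 1 - \Phi\left(\frac{\gamma^2}{\theta} Z_\alpha + \frac{\gamma^2(\nu_4 - 2) - \theta - \omega(\nu_4 - 3)}{2\theta} + \frac{(\gamma^2 - \theta)\, n}{2\theta}\right).$$

According to Jensen's inequality, $\gamma^2 \le \theta$, and equality holds only when $\Sigma_p = \sigma^2 I_p$; whenever
$\gamma^2 < \theta$ the last term tends to $-\infty$ and the power tends to one. Proposition
3.1 thus follows.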
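
The power formula of Proposition 3.1 can be evaluated for the two simulated alternatives of Section 2.4. The sketch below is illustrative only: the function names are hypothetical, and the limits $\gamma$, $\theta$, $\omega$ are replaced by their finite-$p$ values for a diagonal $\Sigma_p$, for which $\omega = \theta$.

```python
from statistics import NormalDist

def spectral_limits(diag):
    """Finite-p versions of gamma, theta, omega for a diagonal covariance matrix."""
    p = len(diag)
    gamma = sum(diag) / p
    theta = sum(d * d for d in diag) / p
    omega = theta   # for a diagonal matrix, sum of Sigma_ii^2 equals tr(Sigma^2)
    return gamma, theta, omega

def john_power(gamma, theta, omega, n, nu4=3.0, alpha=0.05):
    """Asymptotic power of John's test following Proposition 3.1."""
    z = NormalDist().inv_cdf(1 - alpha)
    arg = ((gamma**2 / theta) * z
           + (gamma**2 * (nu4 - 2) - theta - omega * (nu4 - 3)) / (2 * theta)
           + (gamma**2 - theta) * n / (2 * theta))
    return 1 - NormalDist().cdf(arg)

p, n = 320, 64
power1 = john_power(*spectral_limits([0.5] * (p // 2) + [1.0] * (p // 2)), n=n)
power2 = john_power(*spectral_limits([0.5] * (p // 4) + [1.0] * (3 * p // 4)), n=n)
```

For Gaussian data with $n = 64$ this gives values of roughly 0.96 and 0.63, in line with the empirical powers of John's test reported in Table 1. Setting $\gamma = \theta = \omega = 1$ returns the nominal level $\alpha$, as expected under $H_0$.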


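The proof of Theorem 3.2 is based on Lemma 3.1.

Proof. Denote the eigenvalues of the $p \times p$ matrix $S_n = \frac{1}{n}XX' = \frac{1}{n}\Sigma_p^{1/2} ZZ' \Sigma_p^{1/2}$ in descending order by
$\{l_i, 1 \le i \le p\}$, and the eigenvalues of the $n \times n$ matrix $A = \frac{1}{\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}}\left(Z'\Sigma_p Z - \mathrm{tr}(\Sigma_p)\, I_n\right)$
by $\{\lambda_i, 1 \le i \le n\}$. Since $p > n$, $S_n$ has $p - n$ zero eigenvalues and the remaining $n$ non-zero
eigenvalues $l_i$ are related to $\lambda_i$ by

$$\frac{1}{n}\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}\;\lambda_i + \frac{1}{n}\,\mathrm{tr}(\Sigma_p) = l_i, \quad 1 \le i \le n.$$

We have, for John's test statistic,

$$U = \frac{\frac{1}{p}\sum_{i=1}^n\left(\frac{1}{n}\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}\,\lambda_i + \frac{1}{n}\mathrm{tr}(\Sigma_p)\right)^2}{\left(\frac{1}{p}\sum_{i=1}^n\left(\frac{1}{n}\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}\,\lambda_i + \frac{1}{n}\mathrm{tr}(\Sigma_p)\right)\right)^2} - 1 = \frac{\theta\sum_{i=1}^n \lambda_i^2 + 2\gamma\sqrt{\frac{p\theta}{n}}\sum_{i=1}^n \lambda_i + p\gamma^2}{\left(\sqrt{\frac{\theta}{p}}\sum_{i=1}^n \lambda_i + \sqrt{n}\,\gamma\right)^2} - 1.$$

Define the function

$$G(u, v) = \frac{\theta u + 2\gamma\sqrt{\frac{p\theta}{n}}\, v + p\gamma^2}{\left(\sqrt{\frac{\theta}{p}}\, v + \sqrt{n}\,\gamma\right)^2} - 1,$$

then John's test statistic can be written as

$$U = G\left(u = \sum_{i=1}^n \lambda_i^2,\; v = \sum_{i=1}^n \lambda_i\right).$$

According to Lemma 3.1, when $p/n \to \infty$, $n \to \infty$, $n^3/p = O(1)$,

$$\begin{pmatrix} \sum_{i=1}^n \lambda_i^2 - n - \left(\frac{\omega}{\theta}(\nu_4 - 3) + 1\right) \\ \sum_{i=1}^n \lambda_i \end{pmatrix} \xrightarrow{d} N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 4 & 0 \\ 0 & \frac{\omega}{\theta}(\nu_4 - 3) + 2 \end{pmatrix}\right).$$

Then by the Delta Method,

$$n\left(U - G(u, v)\big|_{u = n + \frac{\omega}{\theta}(\nu_4 - 3) + 1,\, v = 0}\right) \xrightarrow{d} N\left(0,\; \lim n^2\, \nabla G \begin{pmatrix} 4 & 0 \\ 0 & \frac{\omega}{\theta}(\nu_4 - 3) + 2 \end{pmatrix} \nabla G'\right),$$

where $\nabla G = \left(\frac{\partial G}{\partial u}, \frac{\partial G}{\partial v}\right)\Big|_{u = n + \frac{\omega}{\theta}(\nu_4 - 3) + 1,\, v = 0}$ is the corresponding gradient vector.

We have, for $(u, v) = \left(n + \frac{\omega}{\theta}(\nu_4 - 3) + 1,\, 0\right)$,

$$G = \frac{p}{n} + \frac{\theta}{\gamma^2} + \frac{\omega(\nu_4 - 3) + \theta}{n\gamma^2} - 1,$$

and

$$n^2\, \nabla G \begin{pmatrix} 4 & 0 \\ 0 & \frac{\omega}{\theta}(\nu_4 - 3) + 2 \end{pmatrix} \nabla G' = \frac{4\theta^2}{\gamma^4} + \left(\frac{\omega}{\theta}(\nu_4 - 3) + 2\right)\frac{4\theta\left(\theta + \omega(\nu_4 - 3) + n\theta\right)^2}{\gamma^6\, n p} \to \frac{4\theta^2}{\gamma^4}.$$

The result thus follows. □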


3.3. Quasi-likelihood ratio test


Consider the Quasi-LRT statistic $\mathcal{L}_n$ in (1.4), based on the eigenvalues of the $n$-dimensional matrix $\frac{1}{p} X'X$. Similarly to John's test statistic, it can be seen that, under the alternative hypothesis $H_1$, the statistic $\mathcal{L}_n$ can be represented as

\[
\mathcal{L}_n = \frac{p}{n} \log \frac{\left( \frac{1}{n} \sum_{i=1}^{n} l_i \right)^n}{\prod_{i=1}^{n} l_i},
\]

where $\{l_i, 1 \le i \le n\}$ are the eigenvalues of $\frac{1}{p} Z'\Sigma_p Z$. The main result on the power of the Quasi-LRT is as follows.
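Theorem 3.3. With the same assumptions as in Theorem 3.2, when $p/n \to \infty$, $n \to \infty$, $n^3/p = O(1)$,

\[
\mathcal{L}_n - \frac{\theta}{2\gamma^2}\, n - \left( \frac{\theta^2}{2\gamma^4} - \frac{\theta\sqrt{\theta}}{3\gamma^3} \right) \frac{n^2}{p} \xrightarrow{d} N\left( \frac{\theta}{2\gamma^2} + \frac{\omega}{2\gamma^2}(\nu_4 - 3),\; \frac{\theta^2}{\gamma^4} \right).
\]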



Note that the theorem above gives the limiting distribution of the Quasi-LRT statistic under the alternative hypothesis $H_1$. If we let $\Sigma_p = \sigma^2 I_p$, then $\gamma = \sigma^2$, $\theta = \omega = \sigma^4$, and Theorem 3.3 reduces to Theorem 2.3, which states the null distribution of the Quasi-LRT statistic under $H_0$. Similarly, from the two limiting distributions of the QLRT statistic under $H_0$ and $H_1$, the power of the test is derived as below.

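Proposition 3.2. With the same assumptions as in Theorem 3.2, when $p/n \to \infty$, $n \to \infty$, $n^3/p = O(1)$, the power of the QLRT, $\beta_{QLRT}(H_1)$, is

\[
1 - \Phi\left( \frac{\gamma^2}{\theta} Z_\alpha + \frac{\gamma^2 - \theta}{2\theta}\, n + \left( \frac{\gamma^2}{6\theta} - \frac{\theta}{2\gamma^2} + \frac{\sqrt{\theta}}{3\gamma} \right) \frac{n^2}{p} + \frac{\gamma^2(\nu_4 - 2) - \theta - \omega(\nu_4 - 3)}{2\theta} \right) \to 1,
\]

where $\alpha$ is the nominal test level, and $Z_\alpha$ and $\Phi(\cdot)$ are the upper-$\alpha$ quantile and the cdf of the standard normal distribution, respectively.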
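
As an illustration of Proposition 3.2, the following sketch evaluates the asymptotic power of the QLRT. The function name `qlrt_power` is hypothetical and the parameter values in the usage lines are arbitrary examples, not values taken from the paper's tables.

```python
from math import sqrt
from statistics import NormalDist


def qlrt_power(gamma, theta, omega, nu4, n, p, alpha=0.05):
    """Asymptotic power of the Quasi-LRT (hypothetical helper)."""
    std = NormalDist()
    z_alpha = std.inv_cdf(1.0 - alpha)  # upper-alpha quantile of N(0, 1)
    g2 = gamma ** 2
    arg = (
        (g2 / theta) * z_alpha
        + (g2 - theta) / (2.0 * theta) * n
        + (g2 / (6.0 * theta) - theta / (2.0 * g2)
           + sqrt(theta) / (3.0 * gamma)) * n ** 2 / p
        + (g2 * (nu4 - 2.0) - theta - omega * (nu4 - 3.0)) / (2.0 * theta)
    )
    return 1.0 - std.cdf(arg)


if __name__ == "__main__":
    # Spherical case: gamma^2 = theta = omega, so the power equals alpha.
    print(qlrt_power(0.5, 0.25, 0.25, nu4=3.0, n=64, p=2400))
    # A non-spherical example with gamma^2 < theta.
    print(qlrt_power(0.6, 0.4, 0.4, nu4=3.0, n=64, p=2400))
```

In the spherical case the coefficient of $n^2/p$ is $1/6 - 1/2 + 1/3 = 0$, so the expression collapses to $1 - \Phi(Z_\alpha) = \alpha$.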

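For the QLRT statistic $\mathcal{L}_n$, under $H_0$,

\[
\mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2} \xrightarrow{d} N(0, 1);
\]

under $H_1$,

\[
\mathcal{L}_n - \frac{\theta}{2\gamma^2}\, n - \left( \frac{\theta^2}{2\gamma^4} - \frac{\theta\sqrt{\theta}}{3\gamma^3} \right) \frac{n^2}{p} \xrightarrow{d} N\left( \frac{\theta}{2\gamma^2} + \frac{\omega}{2\gamma^2}(\nu_4 - 3),\; \frac{\theta^2}{\gamma^4} \right).
\]

Hence

\[
\beta_{QLRT}(H_1) = P\left( \mathcal{L}_n - \frac{n}{2} - \frac{n^2}{6p} - \frac{\nu_4 - 2}{2} > Z_\alpha \,\Big|\, H_1 \right)
\]
\[
= P\left( \frac{\gamma^2}{\theta}\left[ \mathcal{L}_n - \frac{\theta}{2\gamma^2}\, n - \left( \frac{\theta^2}{2\gamma^4} - \frac{\theta\sqrt{\theta}}{3\gamma^3} \right) \frac{n^2}{p} - \frac{\theta}{2\gamma^2} - \frac{\omega}{2\gamma^2}(\nu_4 - 3) \right] > \frac{\gamma^2}{\theta}\left[ Z_\alpha + \frac{n}{2} + \frac{n^2}{6p} + \frac{\nu_4 - 2}{2} - \frac{\theta}{2\gamma^2}\, n - \left( \frac{\theta^2}{2\gamma^4} - \frac{\theta\sqrt{\theta}}{3\gamma^3} \right) \frac{n^2}{p} - \frac{\theta}{2\gamma^2} - \frac{\omega}{2\gamma^2}(\nu_4 - 3) \right] \right)
\]
\[
= 1 - \Phi\left( \frac{\gamma^2}{\theta} Z_\alpha + \frac{\gamma^2 - \theta}{2\theta}\, n + \left( \frac{\gamma^2}{6\theta} - \frac{\theta}{2\gamma^2} + \frac{\sqrt{\theta}}{3\gamma} \right) \frac{n^2}{p} + \frac{\gamma^2(\nu_4 - 2) - \theta - \omega(\nu_4 - 3)}{2\theta} \right).
\]

Since $\gamma^2 \le \theta$, Proposition 3.2 follows.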


The proof of Theorem 3.3 is based on Lemma 3.2.

Proof. Denote the eigenvalues of the $n \times n$ matrix $\frac{1}{p} X'X = \frac{1}{p} Z'\Sigma_p Z$ in descending order by $l_i$ ($1 \le i \le n$), and the eigenvalues of the $n \times n$ matrix $A = \frac{1}{\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}}\left( Z'\Sigma_p Z - \mathrm{tr}(\Sigma_p)\, I_n \right)$ by $\lambda_i$ ($1 \le i \le n$). These eigenvalues are related as

\[
\frac{\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}}{p}\, \lambda_i + \frac{1}{p}\mathrm{tr}(\Sigma_p) = l_i, \quad 1 \le i \le n.
\]


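We have, for the Quasi-LRT test statistic,

\[
\mathcal{L}_n = \frac{p}{n} \log \frac{\left( \frac{1}{n}\sum_{i=1}^{n} l_i \right)^n}{\prod_{i=1}^{n} l_i}
= p \log\left( \gamma + \sqrt{\frac{\theta}{np}} \sum_{i=1}^{n} \lambda_i \right) - \frac{p}{n} \sum_{i=1}^{n} \log\left( \gamma + \sqrt{\frac{n\theta}{p}}\, \lambda_i \right).
\]

Define the function

\[
G(u,v) = p \log\left( \gamma + \sqrt{\frac{\theta}{np}}\, u \right) - \sqrt{\frac{p}{n}}\, v,
\]

then the Quasi-LRT test statistic can be written as

\[
\mathcal{L}_n = G\left( u = \sum_{i=1}^{n} \lambda_i,\; v = \sqrt{\frac{p}{n}} \sum_{i=1}^{n} \log\left( \gamma + \sqrt{\frac{n\theta}{p}}\, \lambda_i \right) \right).
\]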


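According to Lemma 3.2, when $p/n \to \infty$, $n \to \infty$, $n^3/p = O(1)$,

\[
\begin{pmatrix} \sum_{i=1}^{n} \lambda_i \\ \sqrt{\frac{p}{n}} \sum_{i=1}^{n} \log\left( \gamma + \sqrt{\frac{n\theta}{p}}\, \lambda_i \right) - v_0 \end{pmatrix} \xrightarrow{d} N(0, \Psi),
\]

where

\[
v_0 = \sqrt{pn}\, \log(\gamma) - \frac{\theta}{2\gamma^2} \sqrt{\frac{n^3}{p}} - \left( \left( \frac{\theta^2}{2\gamma^4} - \frac{\theta\sqrt{\theta}}{3\gamma^3} \right) \frac{n^2}{p} + \frac{\theta}{2\gamma^2} + \frac{\omega}{2\gamma^2}(\nu_4 - 3) \right) \sqrt{\frac{n}{p}},
\]

\[
\Psi = \begin{pmatrix}
\frac{\omega}{\theta}(\nu_4 - 3) + 2 & \left( \frac{\omega}{\theta}(\nu_4 - 3) + 2 \right)\left( \frac{\sqrt{\theta}}{\gamma} + \frac{\theta\sqrt{\theta}}{\gamma^3} \frac{n}{p} \right) \\
\left( \frac{\omega}{\theta}(\nu_4 - 3) + 2 \right)\left( \frac{\sqrt{\theta}}{\gamma} + \frac{\theta\sqrt{\theta}}{\gamma^3} \frac{n}{p} \right) & \left( \frac{\omega}{\theta}(\nu_4 - 3) + 2 \right) \frac{\theta}{\gamma^2} + \left( \frac{2\omega}{\theta}(\nu_4 - 3) + 5 \right) \frac{\theta^2}{\gamma^4} \frac{n}{p}
\end{pmatrix}.
\]

By the Delta Method,

\[
\mathcal{L}_n - G(u,v)\Big|_{u = 0,\; v = v_0} \xrightarrow{d} N\left( 0,\; \nabla G\, \Psi\, \nabla G' \right),
\]

where $\nabla G = \left( \frac{\partial G}{\partial u}, \frac{\partial G}{\partial v} \right)\Big|_{u = 0,\; v = v_0}$ is the corresponding gradient vector.

Then we have, for $(u, v) = (0, v_0)$,

\[
G(u,v) = \frac{\theta}{2\gamma^2}\, n + \left( \frac{\theta^2}{2\gamma^4} - \frac{\theta\sqrt{\theta}}{3\gamma^3} \right) \frac{n^2}{p} + \frac{\theta}{2\gamma^2} + \frac{\omega}{2\gamma^2}(\nu_4 - 3),
\]

and

\[
\nabla G\, \mathrm{Cov}(u, v)\, \nabla G' = \frac{\theta^2}{\gamma^4}.
\]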


The result thus follows. 


□ 
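
The cancellation behind the limiting variance $\theta^2/\gamma^4$ can be checked numerically. The sketch below is only an illustration: it assumes the covariance matrix of $(u, v)$ has entries $c$, $c\left(\sqrt{\theta}/\gamma + \theta\sqrt{\theta}\, n/(\gamma^3 p)\right)$ and $c\,\theta/\gamma^2 + (2c + 1)\,\theta^2 n/(\gamma^4 p)$ with $c = \frac{\omega}{\theta}(\nu_4 - 3) + 2$, and that the gradient of $G$ at $u = 0$ is $\left(\sqrt{p\theta/n}/\gamma,\; -\sqrt{p/n}\right)$. The function name is hypothetical.

```python
import numpy as np


def qlrt_delta_variance(gamma, theta, omega, nu4, n, p):
    """Quadratic form grad * Psi * grad' from the Delta Method step."""
    c = omega / theta * (nu4 - 3.0) + 2.0
    rt = np.sqrt(theta)
    # Covariance matrix of (u, v) as assumed in the lead-in.
    v12 = c * (rt / gamma + theta * rt * n / (gamma ** 3 * p))
    v22 = c * theta / gamma ** 2 + (2.0 * c + 1.0) * theta ** 2 * n / (gamma ** 4 * p)
    psi = np.array([[c, v12], [v12, v22]])
    # Gradient of G(u, v) = p*log(gamma + sqrt(theta/(n*p))*u) - sqrt(p/n)*v at u = 0.
    grad = np.array([np.sqrt(p * theta / n) / gamma, -np.sqrt(p / n)])
    return float(grad @ psi @ grad)


if __name__ == "__main__":
    gamma, theta, omega = 0.6, 0.4, 0.4
    print(qlrt_delta_variance(gamma, theta, omega, 4.5, 64, 2400))
    print(theta ** 2 / gamma ** 4)
```

The terms of order $p/n$ cancel exactly, leaving $\theta^2/\gamma^4$, which equals 1 in the spherical case.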


3.4. Simulation experiments

The empirical power of the two tests is reported in this section to verify the theoretical results presented in Propositions 3.1 and 3.2. Specifically, we consider two different scenarios for generating the sample data:

(1) $\{Z_j, 1 \le j \le n\}$ are i.i.d. $p$-dimensional random vectors generated from the multivariate normal population $N_p(0, I_p)$, so that $E z_{ij}^4 = \nu_4 = 3$, and $X_j = \Sigma_p^{1/2} Z_j$, $1 \le j \le n$;

(2) $\{z_{ij}, 1 \le i \le p, 1 \le j \le n\}$ are i.i.d. and follow the Gamma(4, 2) $-$ 2 distribution, so that $E z_{ij} = 0$, $E z_{ij}^2 = 1$, $E z_{ij}^4 = \nu_4 = 4.5$, and $X_{p \times n} = \Sigma_p^{1/2} Z_{p \times n}$.

To cover multiple alternative hypotheses, $\Sigma_p$ is configured as a diagonal matrix with diagonal elements 0.5 and 1, where the proportion of "1" is $\delta$. The nominal test level is set as $\alpha = 0.05$. We take $(p, n) = (2400, 64)$, and the empirical power is computed from 5000 replications. Theoretical values are displayed for comparison.
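
The data-generating scheme above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the function names are hypothetical, Gamma(4, 2) is assumed to mean shape 4 and rate 2 (so that the centred variable has mean 0 and variance 1), and the rejection rules assume the null limits $nU - p \to N(\nu_4 - 2, 4)$ and $\mathcal{L}_n - n/2 - n^2/(6p) - (\nu_4 - 2)/2 \to N(0, 1)$.

```python
import numpy as np
from statistics import NormalDist


def simulate_X(p, n, delta, scenario, rng):
    """Draw X = Sigma_p^{1/2} Z with a diagonal Sigma_p (entries 1 and 0.5)."""
    diag = np.full(p, 0.5)
    diag[: int(round(delta * p))] = 1.0  # a proportion delta of ones
    if scenario == "gaussian":
        Z = rng.standard_normal((p, n))
    else:
        # Assumption: Gamma(4, 2) has shape 4 and rate 2, i.e. scale 0.5.
        Z = rng.gamma(shape=4.0, scale=0.5, size=(p, n)) - 2.0
    return np.sqrt(diag)[:, None] * Z


def sphericity_statistics(X):
    """John's U and the Quasi-LRT statistic from the n x n matrix X'X/p."""
    p, n = X.shape
    l = np.linalg.eigvalsh(X.T @ X / p)
    U = p * np.sum(l ** 2) / np.sum(l) ** 2 - 1.0  # scale-invariant in l
    L = (p / n) * (n * np.log(np.mean(l)) - np.sum(np.log(l)))
    return U, L


def reject(U, L, n, p, nu4, alpha=0.05):
    """Rejection decisions (John, QLRT) at level alpha under the assumed null limits."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    john = n * U - p > nu4 - 2.0 + 2.0 * z
    qlrt = L - n / 2.0 - n ** 2 / (6.0 * p) - (nu4 - 2.0) / 2.0 > z
    return bool(john), bool(qlrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = simulate_X(2400, 64, 0.2, "gamma", rng)
    U, L = sphericity_statistics(X)
    print(reject(U, L, 64, 2400, nu4=4.5))
```

Repeating the last three lines over many replications and averaging the decisions gives an empirical power estimate for a chosen $\delta$.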


Table 3

Empirical and theoretical power of the two tests

            Gaussian                                Non-Gaussian
            John's test         QLRT                John's test         QLRT
  δ         Empirical  Theory   Empirical  Theory   Empirical  Theory   Empirical  Theory
  0         0.046      0.050    0.049      0.050    0.051      0.050    0.052      0.050
  0.1       0.738      0.745    0.727      0.759    0.736      0.746    0.727      0.761
  0.2       0.958      0.953    0.954      0.959    0.950      0.954    0.951      0.960
  0.3       0.984      0.979    0.982      0.982    0.981      0.979    0.981      0.982
  0.4       0.978      0.976    0.978      0.980    0.978      0.976    0.978      0.980
  0.5       0.958      0.953    0.958      0.959    0.951      0.954    0.950      0.960


It can be seen from Table 3 that the empirical and theoretical powers agree closely with each other, and that both tests have very large power even when $\delta$ is small.


4. Discussions and Auxiliary Results 


In summary, we found that in the considered ultra-dimensional ($p \gg n$) situations, QLRT is the most recommended procedure for the sphericity test in view of its maximal power. However, from the application perspective, where the dimension $p$ and the sample size $n$ are explicitly known, it becomes very difficult to decide which asymptotic scheme to use, namely "$p$ fixed, $n \to \infty$", "$p/n \to c \in (0, \infty)$, $p, n \to \infty$", or "$p/n \to \infty$, $p, n \to \infty$", etc. Combining our study with the existing literature, we recommend a dimension-proof procedure such as John's test or Chen's test, with a slight preference for John's test as it has slightly higher power and an easier implementation.

We conclude the paper by mentioning some surprising consequences of the main results of the paper, as follows.


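Corollary 4.1. Assume $X = \{x_{ij}\}_{p \times n}$ are i.i.d. satisfying $E(x_{ij}) = 0$, $E(x_{ij}^2) = 1$, $E|x_{ij}|^4 = \nu_4 < \infty$; then when $n/p \to \infty$, $n, p \to \infty$,

\[
-\frac{2}{p} \log L_n - \frac{p}{2} - \frac{p^2}{6n} - \frac{\nu_4 - 2}{2} \xrightarrow{d} N(0, 1),
\]

where $-\frac{2}{p} \log L_n = \frac{n}{p} \log \frac{\left( \frac{1}{p}\sum_{i=1}^{p} l_i \right)^p}{\prod_{i=1}^{p} l_i}$ and $\{l_i\}_{1 \le i \le p}$ are the eigenvalues of the $p$-dimensional sample covariance matrix $\frac{1}{n} XX'$.

Note that if we fix $p$ while letting $n \to \infty$, under the normality assumption, Corollary 4.1 reduces to

\[
-\frac{2}{p} \log L_n - \frac{p}{2} - \frac{1}{2} \xrightarrow{d} N(0, 1),
\]

which is consistent with the classical LRT asymptotic, i.e. $-2 \log L_n \xrightarrow{d} \chi^2_{\frac{1}{2}p(p+1) - 1}$.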

Corollary 4.2. Assume $X = \{x_{ij}\}_{p \times n}$ are i.i.d. satisfying $E(x_{ij}) = 0$, $E(x_{ij}^2) = 1$, $E|x_{ij}|^4 = \nu_4 < \infty$; then when $n/p \to \infty$, $n, p \to \infty$,

\[
nU - p \xrightarrow{d} N(\nu_4 - 2, 4).
\]


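Proof. Interchanging the roles of $n$ and $p$ in Theorem 2.2 and keeping all other assumptions unchanged, it can be seen that, when $n/p \to \infty$, $n, p \to \infty$,

\[
p\tilde{U} - n \xrightarrow{d} N(\nu_4 - 2, 4),
\]

where

\[
\tilde{U} = \frac{\frac{1}{n}\sum_{i=1}^{n} \tilde{l}_i^2}{\left( \frac{1}{n}\sum_{i=1}^{n} \tilde{l}_i \right)^2} - 1,
\]

and $\tilde{l}_i$ ($1 \le i \le n$) are the eigenvalues of the $n \times n$ matrix $\frac{1}{p} X'X$. If $l_i$ ($1 \le i \le p$) are the eigenvalues of $\frac{1}{n} XX'$, then

\[
p\tilde{U} - n = p\, \frac{\frac{1}{n}\sum_{i=1}^{n} \tilde{l}_i^2}{\left( \frac{1}{n}\sum_{i=1}^{n} \tilde{l}_i \right)^2} - p - n
= n\, \frac{\frac{1}{p}\sum_{i=1}^{p} l_i^2}{\left( \frac{1}{p}\sum_{i=1}^{p} l_i \right)^2} - n - p = nU - p.
\]

□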
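
The algebraic identity $p\tilde{U} - n = nU - p$ used in this proof can be checked on simulated data. The sketch below is only an illustration with hypothetical function names; it takes $n > p$ and computes both sides from the two sets of eigenvalues.

```python
import numpy as np


def both_sides(X):
    """Return (n*U - p, p*U_tilde - n) for a p x n data matrix X."""
    p, n = X.shape
    l = np.linalg.eigvalsh(X @ X.T / n)        # p eigenvalues of XX'/n
    l_t = np.linalg.eigvalsh(X.T @ X / p)      # n eigenvalues of X'X/p
    U = np.mean(l ** 2) / np.mean(l) ** 2 - 1.0
    U_t = np.mean(l_t ** 2) / np.mean(l_t) ** 2 - 1.0
    return n * U - p, p * U_t - n


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 30))           # here n is larger than p
    print(both_sides(X))
```

The two numbers agree up to floating-point error because the non-zero eigenvalues of $\frac{1}{p}X'X$ are $n/p$ times those of $\frac{1}{n}XX'$.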


Hence the dimension-proof property of John's test statistic, i.e. that its null limiting distribution is the same regardless of normality and under any $(n, p)$-asymptotic, $n/p \to [0, \infty]$, has been fully verified.
Appendix A: Technical lemmas and additional proofs


Lemma A.1. Consider the central limit theorem for linear functions of eigenvalues of the re-normalized sample covariance matrix $A$ when the dimension $p$ is much larger than the sample size $n$, derived by Chen and Pan (2013). Let $\mathscr{U}$ denote any open region of the complex plane including $[-2, 2]$, the support of the semicircle law $F(x)$, and denote the Stieltjes transform of the semicircle law $F$ by $m(z)$. Let $\mathcal{M}$ be the set of functions which are analytic on $\mathscr{U}$. For any analytic function $f \in \mathcal{M}$, the mean correction term is defined as

\[
\frac{n}{2\pi i} \oint_{|m| = \rho} f(-m - m^{-1})\, \mathcal{X}_n^{Calib}(m)\, \frac{1 - m^2}{m^2}\, dm.
\]

Define the functions $f_1(x) = x^2$, $f_2(x) = x$, $f_3(x) = \frac{p}{n} \log\left( 1 + \sqrt{\frac{n}{p}}\, x \right)$; then the mean correction terms in equation (2.1) for these functions are as follows:

\[
\frac{n}{2\pi i} \oint_{|m| = \rho} f_1(-m - m^{-1})\, \mathcal{X}_n^{Calib}(m)\, \frac{1 - m^2}{m^2}\, dm = \nu_4 - 2,
\]
\[
\frac{n}{2\pi i} \oint_{|m| = \rho} f_2(-m - m^{-1})\, \mathcal{X}_n^{Calib}(m)\, \frac{1 - m^2}{m^2}\, dm = 0,
\]
\[
\frac{n}{2\pi i} \oint_{|m| = \rho} f_3(-m - m^{-1})\, \mathcal{X}_n^{Calib}(m)\, \frac{1 - m^2}{m^2}\, dm = -\frac{\nu_4 - 2}{2} + \frac{n^2}{3p}.
\]


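Proof. Since

\[
\mathcal{X}_n^{Calib}(m) = \frac{-B + \sqrt{B^2 - 4AC^{Calib}}}{2A}, \qquad A = m - \sqrt{\frac{n}{p}}\,(1 + m^2),
\]
\[
B = m^2 - 1 - \sqrt{\frac{n}{p}}\, m(1 + 2m^2), \qquad C^{Calib} = \frac{m^3}{n}\left[ \nu_4 - 2 + \frac{m^2}{1 - m^2} - 2(\nu_4 - 1)\, m \sqrt{\frac{n}{p}} \right] - \sqrt{\frac{n}{p}}\, m^4,
\]

the contour of the integral is taken as $|m| = \rho$ with $\rho < 1$.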
For $f_1(x) = x^2$, choose $\rho < \sqrt{n/p} < \sqrt{p/n}$,

l ... 

n\=p 

(1 — m 4 )(l + m 2 ) X 


n 


in ^ i u 1 — m 2 
-n , ,Calib/"^,\ dm 


0 - f fi {-m - m 3 ) Xn {rn) - f 

2 iriJ\m\ =p m z 


n 

2 ni 


\m\=p 


m ’ 


1 — „ -m m — .. m 
p J V p 


dm 


^denote X := — £> + \J B 2 — T4,C Callb 

1 + ttt 2 -X 


n 


1 


2ttz ,/j m | =p m 4 


1 — ' / p m J 


dm 


(Cauchy’s Residue Theorem) 

1 



\ / 

/ 

/dm 3 

+ d 

// 

ra=0 V 


nX 




For f 2 (x) = x, choose p < , 


1 — m 2 


^£,=/ 2(_ ”'" rl)x? " b(m): » 2 

= LU 


(—m — m 1 ) 


2 iri j\ m \ = p 


X 


-dm 

1 1 — m 2 


1 —,/^mJ m ~\/^ m 


dm 


X 


n r 1 

2vri/| m | =p rn 3 ^ _ /a 


m m — 


dm 


= -id( 2 ) 
2 ! 


nl 




dm' 


= 0. 


j dm 

m=0 


m=0 
For $f_3(x) = \frac{p}{n} \log\left( 1 + \sqrt{\frac{n}{p}}\, x \right)$, choose $\rho < \sqrt{n/p} < \sqrt{p/n}$,


n 


IN nn 1 — m 2 
-l\ ,.Cahtv, m A dm 


0 : f h {~m - m 4 ) Xn ( m )- 2 

27TZ /| m |_p 777- 


= y~- log 

« 00 

= —/ E 

27Tz/| m | =p ^ _ 



1 — m 2 


-dm 


m^ 


1 — m 2 


1 — . -m m-J'-i 


1 


^\m\ = 


\m\=p . 


dm 


n rm 




3 

2\3 


+ 


+ 


m’ 


, 2\2 


1 — m 4 n (1 — m 4 )(l + m 2 ) n Fn (1 — m 4 )(l + m 2 ) 


m J 


+ n 2 (1 — m 4 )(l + m 2 ) 


4 p 


m u 






1 — . -m m-J 5 


dm + 0 — 



P 


XI 

2 71 "* yimi = 


\/np- o + 

m a 


M=P . 

n 2 1 + 3m 2 + 2m 4 

+— 


4p 


m u 



According to Cauchy’s residue theorem, we have 


1 


l 


1 


n ■ r 3 

2 m / H=p m 6 


X 


1 — . -m ) m-J £ 


similarly, 


Ai v .„ 

2vrij| m | = p _ 


n 

+— 

4 p 


m u 



dm 


= 0, 


- 0 - 1/4-2 _ 0 + p2 + 1/4 ~ 2 - 0 - ( 2t/4 ~ 3 ) n _ (F4 - 2)n 2 _ 3 ( 1/4 - 2)n _ Q + Q 


va — 2 rr 

—-1-ho 

2 3p 


3n 


P 


4p2 


4 p 



n 

V 



n 

P 


□ 


Proof of Lemma 2.1:

Proof. According to Theorem 2.1, define the function $f_1(x) = x^2$; then

\[
G_n(f_1) = n \int_{-\infty}^{+\infty} f_1(x)\, d\left( F^A(x) - F(x) \right)
= \sum_{i=1}^{n} \lambda_i^2 - n \int_{-2}^{2} x^2 \cdot \frac{1}{2\pi} \sqrt{4 - x^2}\, dx
= \sum_{i=1}^{n} \lambda_i^2 - n,
\]

where $F^A$ is the ESD of $A = \frac{1}{\sqrt{np}}\left( X'X - pI_n \right)$ and $F$ represents the semicircular law. The mean correction term for $f_1(x) = x^2$ is, according to Lemma A.1,

\[
\frac{n}{2\pi i} \oint_{|m| = \rho} f_1(-m - m^{-1})\, \mathcal{X}_n^{Calib}(m)\, \frac{1 - m^2}{m^2}\, dm = \nu_4 - 2.
\]

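As for the mean function and covariance function of the Gaussian limit $Y(f_1)$, since

\[
\Phi_1(f_1) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 4\cos^3\theta\, d\theta = 0,
\]
\[
\Phi_2(f_1) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 4\cos^2\theta \cos 2\theta\, d\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} (\cos 4\theta + 1 + 2\cos 2\theta)\, d\theta = 1,
\]
\[
\Phi_k(f_1) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 4\cos^2\theta \cos k\theta\, d\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2(\cos 2\theta + 1)\cos k\theta\, d\theta
= \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( \cos(k - 2)\theta + \cos(k + 2)\theta + 2\cos k\theta \right) d\theta = 0, \quad \text{for } k \ge 3,
\]

therefore $\mathrm{Var}(Y(f_1)) = 4$; in addition, $E(Y(f_1)) = 0$. In conclusion, we have, when $p/n \to \infty$, $n \to \infty$,

\[
\sum_{i=1}^{n} \lambda_i^2 - n - (\nu_4 - 2) \xrightarrow{d} N(0, 4).
\]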

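Similarly, if we define the function $f_2(x) = x$, then

\[
G_n(f_2) = n \int_{-\infty}^{+\infty} f_2(x)\, d\left( F^A(x) - F(x) \right)
= \sum_{i=1}^{n} \lambda_i - n \int_{-2}^{2} x \cdot \frac{1}{2\pi} \sqrt{4 - x^2}\, dx = \sum_{i=1}^{n} \lambda_i.
\]

The mean correction term for $f_2(x) = x$ is, according to Lemma A.1,

\[
\frac{n}{2\pi i} \oint_{|m| = \rho} f_2(-m - m^{-1})\, \mathcal{X}_n^{Calib}(m)\, \frac{1 - m^2}{m^2}\, dm = 0.
\]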
m 2 
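As for the mean function and covariance function of the Gaussian limit $Y(f_2)$, since

\[
\Phi_0(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\cos\theta\, d\theta = 0, \qquad
\Phi_1(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\cos^2\theta\, d\theta = 1,
\]
\[
\Phi_2(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\cos\theta \cos 2\theta\, d\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} (\cos 3\theta + \cos\theta)\, d\theta = 0,
\]
\[
\Phi_k(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\cos\theta \cos k\theta\, d\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( \cos(k + 1)\theta + \cos(k - 1)\theta \right) d\theta = 0, \quad \text{for } k \ge 3,
\]

therefore

\[
\mathrm{Var}(G_n(f_2)) = (\nu_4 - 3)\Phi_1(f_2)\Phi_1(f_2) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_2)\Phi_k(f_2) = \nu_4 - 1;
\]

in addition, $E(Y(f_2)) = 0$. In conclusion, we have, when $p/n \to \infty$, $n \to \infty$,

\[
\sum_{i=1}^{n} \lambda_i^2 - n - (\nu_4 - 2) \xrightarrow{d} N(0, 4), \qquad \sum_{i=1}^{n} \lambda_i \xrightarrow{d} N(0, \nu_4 - 1).
\]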

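Now consider the covariance between $G_n(f_1)$ and $G_n(f_2)$:

\[
\mathrm{Cov}(G_n(f_1), G_n(f_2)) = (\nu_4 - 3)\Phi_1(f_1)\Phi_1(f_2) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_1)\Phi_k(f_2) = 0.
\]

Consequently, when $p/n \to \infty$, $n \to \infty$,

\[
\begin{pmatrix} \sum_{i=1}^{n} \lambda_i^2 - n - (\nu_4 - 2) \\ \sum_{i=1}^{n} \lambda_i \end{pmatrix} \xrightarrow{d} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 4 & 0 \\ 0 & \nu_4 - 1 \end{pmatrix} \right).
\]

□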
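
The coefficients $\Phi_k(f)$ and the resulting variances can be checked by numerical quadrature. The sketch below is only an illustration: the function names `phi` and `limit_variance` are hypothetical, and the variance formula $(\nu_4 - 3)\Phi_1^2(f) + 2\sum_k k\,\Phi_k^2(f)$ is the one used in the proof above, with the infinite sum truncated at a finite `kmax`.

```python
import numpy as np


def phi(f, k, m=4096):
    """Phi_k(f) = (1/(2*pi)) * integral over [-pi, pi] of f(2cos t) cos(kt) dt,
    approximated by the mean over a uniform periodic grid."""
    t = -np.pi + 2.0 * np.pi * np.arange(m) / m
    return float(np.mean(f(2.0 * np.cos(t)) * np.cos(k * t)))


def limit_variance(f, nu4, kmax=20):
    """(nu4 - 3) * Phi_1(f)^2 + 2 * sum_k k * Phi_k(f)^2, truncated at kmax."""
    tail = sum(k * phi(f, k) ** 2 for k in range(1, kmax + 1))
    return (nu4 - 3.0) * phi(f, 1) ** 2 + 2.0 * tail


if __name__ == "__main__":
    f1 = lambda x: x ** 2
    f2 = lambda x: x
    print(phi(f1, 2), limit_variance(f1, 4.5), limit_variance(f2, 4.5))
```

For $f_1(x) = x^2$ and $f_2(x) = x$ the integrands are trigonometric polynomials, so the uniform-grid average is exact up to rounding.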


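Proof of Lemma 2.2:

Proof. According to Theorem 2.1, define the function $f_3(x) = \frac{p}{n} \log\left( 1 + \sqrt{\frac{n}{p}}\, x \right)$; then

\[
G_n(f_3) = n \int_{-\infty}^{+\infty} f_3(x)\, d\left( F^A(x) - F(x) \right)
= \frac{p}{n} \sum_{i=1}^{n} \log\left( 1 + \lambda_i \sqrt{\frac{n}{p}} \right) - n \int \frac{p}{n} \log\left( 1 + \sqrt{\frac{n}{p}}\, x \right) dF(x),
\]

where $F^A$ is the ESD of $A = \frac{1}{\sqrt{np}}\left( X'X - pI_n \right)$ and $F$ represents the semicircular law. Moreover,

\[
n \int \frac{p}{n} \log\left( 1 + \sqrt{\frac{n}{p}}\, x \right) dF(x) = p \int_{-2}^{2} \log\left( 1 + \sqrt{\frac{n}{p}}\, x \right) \frac{1}{2\pi} \sqrt{4 - x^2}\, dx = -\frac{n}{2} - \frac{n^2}{2p} + o\left( \frac{n}{p} \right).
\]

The mean correction term for $f_3(x) = \frac{p}{n} \log\left( 1 + \sqrt{\frac{n}{p}}\, x \right)$ is, according to Lemma A.1,

\[
\frac{n}{2\pi i} \oint_{|m| = \rho} f_3(-m - m^{-1})\, \mathcal{X}_n^{Calib}(m)\, \frac{1 - m^2}{m^2}\, dm = -\frac{\nu_4 - 2}{2} + \frac{n^2}{3p}.
\]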

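As for the mean function and covariance function of the Gaussian limit $Y(f_3)$, since $\log(1 + x) = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{x^k}{k}$,

\[
\Phi_1(f_3) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f_3(2\cos\theta) \cos\theta\, d\theta
= \frac{1}{2\pi} \int_{-\pi}^{\pi} \sqrt{\frac{p}{n}}\, 2\cos\theta \cdot \cos\theta\, d\theta - \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{2}(2\cos\theta)^2 \cdot \cos\theta\, d\theta + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{3}\sqrt{\frac{n}{p}}\, (2\cos\theta)^3 \cdot \cos\theta\, d\theta + o\left( \sqrt{\frac{n}{p}} \right)
= \sqrt{\frac{p}{n}} + \sqrt{\frac{n}{p}} + o\left( \sqrt{\frac{n}{p}} \right),
\]

and, for $k \ge 2$,

\[
\Phi_k(f_3) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f_3(2\cos\theta) \cos k\theta\, d\theta
= \begin{cases} -\frac{1}{2}, & k = 2, \\ \frac{1}{3}\sqrt{\frac{n}{p}}, & k = 3, \\ 0, & k \ge 4, \end{cases} \; + \; o\left( \sqrt{\frac{n}{p}} \right),
\]

therefore

\[
\mathrm{Var}(G_n(f_3)) = (\nu_4 - 3)\Phi_1(f_3)\Phi_1(f_3) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_3)\Phi_k(f_3)
= (\nu_4 - 1)\left( \sqrt{\frac{p}{n}} + \sqrt{\frac{n}{p}} \right)^2 + 2 \cdot 2 \cdot \left( -\frac{1}{2} \right)^2 + 2 \cdot 3 \cdot \left( \frac{1}{3}\sqrt{\frac{n}{p}} \right)^2
= (\nu_4 - 1) \cdot \frac{p}{n} + 2\nu_4 - 1 + \frac{n}{p}\left( \nu_4 - \frac{1}{3} \right);
\]

in addition, $E(Y(f_3)) = 0$. In conclusion, we have, when $p/n \to \infty$, $n \to \infty$,

\[
\frac{p}{n} \sum_{i=1}^{n} \log\left( 1 + \sqrt{\frac{n}{p}}\, \lambda_i \right) + \frac{n}{2} + \frac{n^2}{6p} + \frac{\nu_4 - 2}{2} \xrightarrow{d} N\left( 0,\; \frac{p}{n}(\nu_4 - 1) + 2\nu_4 - 1 + \frac{n}{p}\left( \nu_4 - \frac{1}{3} \right) \right).
\]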

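If we define the function $f_2(x) = x$, it has been proved in Lemma 2.1 that

\[
G_n(f_2) = n \int_{-\infty}^{+\infty} f_2(x)\, d\left( F^A(x) - F(x) \right) = \sum_{i=1}^{n} \lambda_i.
\]

The mean correction term for $f_2(x) = x$ is, according to Lemma A.1,

\[
\frac{n}{2\pi i} \oint_{|m| = \rho} f_2(-m - m^{-1})\, \mathcal{X}_n^{Calib}(m)\, \frac{1 - m^2}{m^2}\, dm = 0.
\]

As for the mean function and covariance function of the Gaussian limit $Y(f_2)$,

\[
\Phi_0(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\cos\theta\, d\theta = 0, \qquad
\Phi_1(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\cos^2\theta\, d\theta = 1,
\]
\[
\Phi_2(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\cos\theta \cos 2\theta\, d\theta = 0, \qquad
\Phi_k(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\cos\theta \cos k\theta\, d\theta = 0 \quad \text{for } k \ge 3,
\]
\[
\mathrm{Var}(G_n(f_2)) = (\nu_4 - 3)\Phi_1(f_2)\Phi_1(f_2) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_2)\Phi_k(f_2) = \nu_4 - 1;
\]

in addition, $E(Y(f_2)) = 0$. In conclusion, we have, when $p/n \to \infty$, $n \to \infty$,

\[
\sum_{i=1}^{n} \lambda_i \xrightarrow{d} N(0, \nu_4 - 1).
\]

Now consider the covariance between $G_n(f_3)$ and $G_n(f_2)$:

\[
\mathrm{Cov}(G_n(f_3), G_n(f_2)) = (\nu_4 - 3)\Phi_1(f_3)\Phi_1(f_2) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_3)\Phi_k(f_2) = (\nu_4 - 1)\left( \sqrt{\frac{p}{n}} + \sqrt{\frac{n}{p}} \right).
\]

Consequently the result follows. □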

Proof of Lemma 3.1:

Proof. According to Theorem 3.1, define the function $f_1(x) = x^2$; then

\[
G_n(f_1) = n \int_{-\infty}^{+\infty} f_1(x)\, d\left( F^A(x) - F(x) \right) - \sqrt{\frac{n^3}{p}}\, \Phi_3(f_1)
= \sum_{i=1}^{n} \lambda_i^2 - n \int_{-2}^{2} x^2 \cdot \frac{1}{2\pi} \sqrt{4 - x^2}\, dx - \sqrt{\frac{n^3}{p}}\, \Phi_3(f_1)
= \sum_{i=1}^{n} \lambda_i^2 - n - \sqrt{\frac{n^3}{p}}\, \Phi_3(f_1),
\]

where $F^A$ is the ESD of $A = \frac{1}{\sqrt{n\,\mathrm{tr}(\Sigma_p^2)}}\left( Z'\Sigma_p Z - \mathrm{tr}(\Sigma_p)\, I_n \right)$ and $F$ represents the semicircular law.

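As for the mean function and covariance function of the Gaussian limit $Y(f_1)$, since

\[
\Phi_0(f_1) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f_1(2\cos t)\, dt = 2,
\]
\[
\Phi_k(f_1) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f_1(2\cos t) \cos(kt)\, dt = \begin{cases} 0, & k = 1, \\ 1, & k = 2, \\ 0, & k \ge 3, \end{cases}
\]

we have $G_n(f_1) = \sum_{i=1}^{n} \lambda_i^2 - n$,

\[
E(Y(f_1)) = \frac{1}{4}\left( f_1(2) + f_1(-2) \right) - \frac{1}{2}\Phi_0(f_1) + \frac{\omega}{\theta}(\nu_4 - 3)\Phi_2(f_1)
= 2 - 1 + \frac{\omega}{\theta}(\nu_4 - 3) = \frac{\omega}{\theta}(\nu_4 - 3) + 1,
\]
\[
\mathrm{Var}(Y(f_1)) = \frac{\omega}{\theta}(\nu_4 - 3)\Phi_1^2(f_1) + 2\sum_{k=1}^{\infty} k\,\Phi_k^2(f_1) = 4.
\]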

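Similarly, if we define the function $f_2(x) = x$, then

\[
G_n(f_2) = n \int_{-\infty}^{+\infty} f_2(x)\, d\left( F^A(x) - F(x) \right) - \sqrt{\frac{n^3}{p}}\, \Phi_3(f_2)
= \sum_{i=1}^{n} \lambda_i - n \int_{-2}^{2} x \cdot \frac{1}{2\pi} \sqrt{4 - x^2}\, dx - \sqrt{\frac{n^3}{p}}\, \Phi_3(f_2)
= \sum_{i=1}^{n} \lambda_i - \sqrt{\frac{n^3}{p}}\, \Phi_3(f_2).
\]

As for the mean function and covariance function of the Gaussian limit $Y(f_2)$, since

\[
\Phi_0(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f_2(2\cos t)\, dt = 0, \qquad
\Phi_k(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f_2(2\cos t) \cos(kt)\, dt = \begin{cases} 1, & k = 1, \\ 0, & k \ge 2, \end{cases}
\]

we have $G_n(f_2) = \sum_{i=1}^{n} \lambda_i$,

\[
E(Y(f_2)) = \frac{1}{4}\left( f_2(2) + f_2(-2) \right) - \frac{1}{2}\Phi_0(f_2) + \frac{\omega}{\theta}(\nu_4 - 3)\Phi_2(f_2) = 0,
\]
\[
\mathrm{Var}(Y(f_2)) = \frac{\omega}{\theta}(\nu_4 - 3) + 2,
\]
\[
\mathrm{Cov}(Y(f_1), Y(f_2)) = \frac{\omega}{\theta}(\nu_4 - 3)\Phi_1(f_1)\Phi_1(f_2) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_1)\Phi_k(f_2) = 0,
\]

therefore

\[
\begin{pmatrix} \sum_{i=1}^{n} \lambda_i^2 - n - \left( \frac{\omega}{\theta}(\nu_4 - 3) + 1 \right) \\ \sum_{i=1}^{n} \lambda_i \end{pmatrix} \xrightarrow{d} N_2\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 4 & 0 \\ 0 & \frac{\omega}{\theta}(\nu_4 - 3) + 2 \end{pmatrix} \right)
\]

as $n \to \infty$, $p \to \infty$, $n^3/p = O(1)$.

□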


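Proof of Lemma 3.2:

Proof. According to Theorem 3.1, define the function $f_3(x) = \frac{p}{n} \log\left( \gamma + \sqrt{\frac{n\theta}{p}}\, x \right)$; then

\[
G_n(f_3) = \sum_{i=1}^{n} f_3(\lambda_i) - n \int f_3(x)\, dF(x) - \sqrt{\frac{n^3}{p}}\, \Phi_3(f_3),
\]

where

\[
n \int f_3(x)\, dF(x) = p \int_{-2}^{2} \log\left( \gamma + \sqrt{\frac{n\theta}{p}}\, x \right) \frac{1}{2\pi} \sqrt{4 - x^2}\, dx
\]
\[
= p \log\gamma - p \int_{-2}^{2} \frac{\theta n}{2\gamma^2 p}\, x^2\, \frac{1}{2\pi} \sqrt{4 - x^2}\, dx - p \int_{-2}^{2} \frac{\theta^2 n^2}{4\gamma^4 p^2}\, x^4\, \frac{1}{2\pi} \sqrt{4 - x^2}\, dx + o\left( \frac{n}{p} \right)
= p \log\gamma - \frac{\theta}{2\gamma^2}\, n - \frac{\theta^2 n^2}{2\gamma^4 p} + o\left( \frac{n}{p} \right).
\]

Moreover,

\[
\Phi_0(f_3) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{p}{n} \log\left( \gamma + 2\sqrt{\frac{n\theta}{p}} \cos t \right) dt
= \frac{p}{n} \log\gamma + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{2\sqrt{\theta}}{\gamma} \sqrt{\frac{p}{n}} \cos t\, dt - \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{2\theta}{\gamma^2} (\cos t)^2\, dt + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{8\theta\sqrt{\theta}}{3\gamma^3} \sqrt{\frac{n}{p}}\, (\cos t)^3\, dt + o\left( \sqrt{\frac{n}{p}} \right)
= \frac{p}{n} \log\gamma - \frac{\theta}{\gamma^2} + o\left( \sqrt{\frac{n}{p}} \right),
\]

\[
\Phi_k(f_3) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{p}{n} \log\left( \gamma + 2\sqrt{\frac{n\theta}{p}} \cos t \right) \cos kt\, dt
= \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{p}{n} \log\gamma \cos kt\, dt + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{2\sqrt{\theta}}{\gamma} \sqrt{\frac{p}{n}} \cos t \cos kt\, dt - \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{2\theta}{\gamma^2} (\cos t)^2 \cos kt\, dt + \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{8\theta\sqrt{\theta}}{3\gamma^3} \sqrt{\frac{n}{p}}\, (\cos t)^3 \cos kt\, dt + o\left( \sqrt{\frac{n}{p}} \right)
\]
\[
= \begin{cases}
\frac{\sqrt{\theta}}{\gamma} \sqrt{\frac{p}{n}} + \frac{\theta\sqrt{\theta}}{\gamma^3} \sqrt{\frac{n}{p}}, & k = 1, \\
-\frac{\theta}{2\gamma^2}, & k = 2, \\
\frac{\theta\sqrt{\theta}}{3\gamma^3} \sqrt{\frac{n}{p}}, & k = 3, \\
0, & k \ge 4,
\end{cases} \; + \; o\left( \sqrt{\frac{n}{p}} \right),
\]

thus

\[
\sqrt{\frac{n^3}{p}}\, \Phi_3(f_3) = \frac{\theta\sqrt{\theta}\, n^2}{3\gamma^3 p},
\]

\[
E(Y(f_3)) = \frac{1}{4}\left( f_3(2) + f_3(-2) \right) - \frac{1}{2}\Phi_0(f_3) + \frac{\omega}{\theta}(\nu_4 - 3)\Phi_2(f_3)
\]
\[
= \frac{p}{4n}\left( \log\left( \gamma + 2\sqrt{\frac{n\theta}{p}} \right) + \log\left( \gamma - 2\sqrt{\frac{n\theta}{p}} \right) \right) - \frac{1}{2}\left( \frac{p}{n} \log\gamma - \frac{\theta}{\gamma^2} \right) - \frac{\omega}{\theta}(\nu_4 - 3) \frac{\theta}{2\gamma^2} + o(1)
= -\frac{\theta}{2\gamma^2} - \frac{\omega}{2\gamma^2}(\nu_4 - 3) + o(1),
\]

\[
\mathrm{Var}(Y(f_3)) = \frac{\omega}{\theta}(\nu_4 - 3)\Phi_1^2(f_3) + 2\sum_{k=1}^{\infty} k\,\Phi_k^2(f_3)
\]
\[
= \left( \frac{\omega}{\theta}(\nu_4 - 3) + 2 \right)\left( \frac{\sqrt{\theta}}{\gamma} \sqrt{\frac{p}{n}} + \frac{\theta\sqrt{\theta}}{\gamma^3} \sqrt{\frac{n}{p}} \right)^2 + 4 \cdot \frac{\theta^2}{4\gamma^4} + 6 \cdot \frac{\theta^3 n}{9\gamma^6 p}
\]
\[
= \left( \frac{\omega}{\theta}(\nu_4 - 3) + 2 \right) \frac{\theta p}{\gamma^2 n} + \left( \frac{2\omega}{\theta}(\nu_4 - 3) + 5 \right) \frac{\theta^2}{\gamma^4} + \left( \frac{\omega}{\theta}(\nu_4 - 3) + \frac{8}{3} \right) \frac{\theta^3 n}{\gamma^6 p}.
\]

Consider the function $f_2(x) = x$; from Lemma 3.1, we have

\[
\Phi_k(f_2) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f_2(2\cos t) \cos(kt)\, dt = \begin{cases} 0, & k = 0, \\ 1, & k = 1, \\ 0, & k \ge 2, \end{cases}
\]

therefore the covariance between $Y(f_2)$ and $Y(f_3)$ is

\[
\mathrm{Cov}(Y(f_2), Y(f_3)) = \frac{\omega}{\theta}(\nu_4 - 3)\Phi_1(f_2)\Phi_1(f_3) + 2\sum_{k=1}^{\infty} k\,\Phi_k(f_2)\Phi_k(f_3)
= \left( \frac{\omega}{\theta}(\nu_4 - 3) + 2 \right)\left( \frac{\sqrt{\theta}}{\gamma} \sqrt{\frac{p}{n}} + \frac{\theta\sqrt{\theta}}{\gamma^3} \sqrt{\frac{n}{p}} \right);
\]

consequently the result follows.


□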










